Abstract

When an Alexa built-in device is playing audio, users must make extra effort to cut through the acoustic echo of the playback (music or synthesized speech) to trigger the wakeword detector and start talking to Alexa, i.e., to barge into the conversation. These are high-friction events in which Alexa customers typically experience a higher false reject rate (FRR). The acoustic echo canceller (AEC) and beamformer in the multichannel audio front end (AFE) are essential for mitigating this effect. However, when the playback volume is high, the AFE output still contains strong residual echo that downstream wakeword detection models may have difficulty handling. In this talk, we present an innovative multi-microphone yet array-agnostic, reference-free AEC tailored specifically for wakeword detection. Compared with traditional AEC, the new approach has five practically attractive benefits: (1) it is reference-free and array-agnostic (a blind method suitable for in-model implementation); (2) it is inherently immune to loudspeaker nonlinearities; (3) it needs no synchronization or alignment step; (4) its gains grow with the number of microphones; (5) it cancels both echo and noise at once. A proof-of-concept experiment is discussed to validate the effectiveness of the proposed algorithm.
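For contrast with the reference-free approach, the sketch below shows a textbook reference-based canceller (a normalized LMS adaptive filter). It is not the proposed algorithm, which this abstract does not detail; it only illustrates why a traditional AEC depends on a time-aligned copy of the playback signal. The function name, the three-tap echo path, and all parameter values are hypothetical and chosen for illustration.

```python
import random

def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Textbook NLMS echo canceller; requires the playback reference `ref`."""
    w = [0.0] * taps      # adaptive estimate of the echo path
    buf = [0.0] * taps    # most recent reference samples, newest first
    out = []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, buf))   # estimated echo
        e = d - y                                    # residual after cancellation
        norm = sum(xi * xi for xi in buf) + eps      # reference energy in the window
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, buf)]
        out.append(e)
    return out, w

def power(sig):
    return sum(s * s for s in sig) / len(sig)

# Synthetic, noise-free example: the microphone hears only a linear echo.
random.seed(0)
ref = [random.uniform(-1.0, 1.0) for _ in range(4000)]   # playback signal
echo_path = [0.6, -0.3, 0.1]                             # assumed linear echo path
mic = [
    sum(h * ref[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(ref))
]

residual, w = nlms_echo_cancel(mic, ref)
echo_suppressed = power(residual[-500:]) < 1e-3 * power(mic[-500:])
```

The filter converges here only because `ref` is available, sample-aligned with `mic`, and the echo path is linear. Those are the three assumptions that benefits (1) to (3) above remove.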

