Abstract

Beamforming is an essential tool for speaker selection and rejection of environmental noise in automatic speech recognition. This work harnesses the efficiency of delay-and-sum (DAS) beamforming by combining it with constant-directivity beamforming (CDB) and frequency-domain feature extraction. CDB facilitates DAS by restricting the bandwidth for different microphone configurations. An array of sigma-delta modulators (SDMs) digitizes eight microphone inputs. The design processes the modulator output bitstreams directly, both for beamforming and for extracting 60 Mel-spectrum power features. The prototype is fabricated in 40-nm CMOS and occupies 1.1 mm². Each SDM consumes 91 µW and has a measured signal-to-noise-and-distortion ratio (SNDR) of 84 dB over an 8-kHz bandwidth. The beamformer and feature extractor consume 76 and 122 µW of dynamic power, respectively. The total power consumption of the prototype is 3.95 mW, including leakage. When the Mel-spectrum outputs are processed with a deep neural network (DNN), keyword-spotting accuracy in the presence of noise improves from 74% without beamforming to 93% with beamforming.
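As a rough illustration of the delay-and-sum principle named in the abstract, the sketch below steers a uniform linear array of eight microphones using integer sample delays. It is not the authors' design: it works on ordinary multi-bit samples rather than sigma-delta bitstreams and does not model CDB. The array spacing (4 cm), sampling rate (48 kHz), test tone (1 kHz), speed of sound, and all function names are assumptions chosen for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the abstract


def simulate_plane_wave(num_mics, spacing_m, angle_deg, freq_hz, fs_hz, n):
    """Sinusoidal plane wave on a uniform linear array (0 degrees = broadside).

    Microphone m receives the wave m * tau seconds after microphone 0.
    """
    tau = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [
        [math.sin(2 * math.pi * freq_hz * (i / fs_hz - m * tau)) for i in range(n)]
        for m in range(num_mics)
    ]


def steering_delays(num_mics, spacing_m, angle_deg, fs_hz):
    """Integer sample delays that time-align a wave arriving from angle_deg."""
    tau = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    # Microphones that hear the wave later need less added delay.
    raw = [-round(m * tau * fs_hz) for m in range(num_mics)]
    base = min(raw)
    return [r - base for r in raw]  # shift so every delay is non-negative


def delay_and_sum(channels, delays):
    """Delay each channel by its integer delay, then average the channels."""
    n = len(channels[0])
    out = [0.0] * n
    for channel, d in zip(channels, delays):
        for i in range(d, n):
            out[i] += channel[i - d]
    return [v / len(channels) for v in out]


def mean_power(signal, skip):
    """Average power, ignoring the first `skip` samples (start-up transient)."""
    tail = signal[skip:]
    return sum(v * v for v in tail) / len(tail)


# Hypothetical setup: 8 microphones, 4 cm spacing, 1 kHz tone from +30 degrees.
mics = simulate_plane_wave(8, 0.04, 30.0, 1000.0, 48000.0, 1008)
on_target = delay_and_sum(mics, steering_delays(8, 0.04, 30.0, 48000.0))
off_target = delay_and_sum(mics, steering_delays(8, 0.04, -30.0, 48000.0))
print(mean_power(on_target, 48) > mean_power(off_target, 48))
```

Steering toward the source adds the eight channels coherently, while steering away from it leaves them out of phase so they largely cancel. This is the spatial selectivity that lets a beamformer favor one speaker over noise from other directions.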
