Abstract

Ultra-low power is a strong requirement for always-on speech interfaces, such as Voice Activity Detection (VAD) and Keyword Spotting (KWS), in wearable and mobile devices [1]–[5]. A KWS system detects specific wake-up words spoken by speakers and has to be always on. Previous ASICs for KWS lack energy-efficient implementations having power […]. For example, a deep neural network (DNN)-based KWS [1] has a large on-chip weight memory of 270KB and consumes $288\mu\mathrm{W}$. A binarized convolutional neural network (CNN) used 52KB of SRAM and $141\mu\mathrm{W}$ of wakeup power at 2.5MHz and 0.57V [2]. An LSTM-based SoC used 105KB of SRAM and reduced power to $16.11\mu\mathrm{W}$ for KWS with 90.8% accuracy on the Google Speech Command Dataset (GSCD) [3]. Laika reduced power to $5\mu\mathrm{W}$ [4], not including the Mel Frequency Cepstrum Coefficient (MFCC) circuit. High compute and memory requirements have prevented always-on KWS chips from operating in the sub-$\mu\mathrm{W}$ range.
