Abstract
A keyword spotting (KWS) system is one of the most important interfaces between humans and machines, since it is usually the entry point to automatic speech recognition and natural language processing. For KWS hardware, however, it remains difficult to make a single chip both low-power and high-performance across multiple scenarios, such as meeting rooms, traffic environments, or parks, because different scenarios span a wide range of signal-to-noise ratios (SNRs). This calls for a balanced design between KWS system accuracy and hardware cost under various noise types and levels. To address this tradeoff, this paper proposes a complete KWS processor, comprising a Mel-Frequency Cepstrum Coefficients (MFCC) feature extractor and a quantized Convolutional Neural Network (QCNN) accelerator, for wide-SNR-range, low-power KWS. First, an approach for quantizing CNNs into QCNNs with high accuracy is proposed, taking the hardware-software tradeoff into account. Based on the tradeoff between KWS system accuracy and hardware cost, a 4-bit/8-bit dual-working-mode strategy is proposed to keep hardware cost low and accuracy high under different scenarios. Specifically, the CNNs and QCNNs are trained, tuned, and validated on a dataset of 10 keywords chosen from the Google Command Speech Dataset (GCSD). Second, a serial-FFT-based MFCC extractor is implemented with low power and a small footprint. Finally, a reconfigurable, approximate-computing-based QCNN accelerator is designed around a novel hybrid reuse strategy for input data and network weights.
Implemented and verified in TSMC 22 nm ULL technology with an area of 1.42 mm², the QCNN accelerator achieves 5.26 μW/9.08 μW power consumption in the 4-bit/8-bit working modes, with accuracy of 88% and 93% respectively, which is superior to state-of-the-art processors.
Highlights
The Internet of Things (IoT) and Artificial Intelligence technologies have been developing and merging in recent years into the AIoT, short for Artificial Intelligence and Internet of Things.
Evaluations and optimizations of the quantized Convolutional Neural Network (QCNN) for the multi-scenario keyword spotting (KWS) system: from the previous experiments and analysis, it can be concluded that the network structure with 4 convolutional layers and 2 fully-connected layers achieves recognition accuracy on the test set of 93.2% and 88.3% under 8-bit and 4-bit quantization, respectively.
In this paper, a complete KWS processor including a Mel-Frequency Cepstrum Coefficients (MFCC) extractor and a QCNN accelerator is proposed for wide-SNR-range, low-power keyword spotting.
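To make the 4-bit versus 8-bit tradeoff concrete, the following is a minimal sketch of uniform symmetric quantization applied to a set of random weights. It is a generic textbook scheme chosen for illustration, not the quantization approach of the paper, which is not detailed here. The function names, the Gaussian weight distribution, and the per-tensor scale are all assumptions.

```python
import random

def quantize_symmetric(values, bits):
    """Map floats to signed integer codes of the given bit width.

    Assumption: one scale per tensor, symmetric range [-qmax, qmax].
    """
    qmax = 2 ** (bits - 1) - 1            # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]

def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Hypothetical weights standing in for one layer of a trained CNN.
random.seed(0)
weights = [random.gauss(0.0, 0.1) for _ in range(256)]

errors = {}
for bits in (4, 8):
    codes, scale = quantize_symmetric(weights, bits)
    errors[bits] = mean_squared_error(weights, dequantize(codes, scale))
    print(f"{bits}-bit: scale={scale:.5f}, mse={errors[bits]:.3e}")
```

The 4-bit codes need half the storage of the 8-bit codes but carry a larger reconstruction error, which is one plausible way to read the accuracy gap between the two working modes reported above.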
Summary
The Internet of Things (IoT) and Artificial Intelligence technologies have been developing and merging in recent years into the AIoT, short for Artificial Intelligence and Internet of Things. We designed and implemented a complete KWS system including a low-power, serial-FFT-based Mel Frequency Cepstrum Coefficients (MFCC) [7] extractor and a reconfigurable, approximate-computing-based Quantized Convolutional Neural Network (QCNN) accelerator. The Quantized CNNs (QCNNs) are realized with the proposed quantization approach, and the tradeoffs between system accuracy and hardware consumption are evaluated. A novel hybrid data-weight reuse strategy, along with a reconfigurable datapath architecture and approximate processing element designs, is introduced, which makes full use of on-chip memory with the proposed QCNN. The system consists of an MFCC extractor and a QCNN accelerator, and achieves high energy efficiency and high accuracy under different SNRs. To introduce our work on the KWS system clearly, the rest of this paper is organized as follows.
VOLUME 8, 2020
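As a rough software illustration of what an MFCC extractor computes for one audio frame, the sketch below chains a Hamming window, an FFT, a triangular mel filterbank, a logarithm, and a DCT-II. It is not the serial FFT hardware described in the paper: the recursive radix-2 FFT, the 256-sample frame, the 16 kHz sampling rate, the 20 filters, and the 10 output coefficients are all assumptions chosen for the example.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    edges = [f * n_fft / sample_rate for f in edges_hz]  # fractional FFT bins
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=10):
    """Compute MFCCs for a single frame (hypothetical parameter defaults)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = fft(windowed)
    power = [abs(c) ** 2 / n for c in spectrum[: n // 2 + 1]]
    bank = mel_filterbank(n_filters, n, sample_rate)
    # Small constant avoids log(0) for empty filter outputs.
    log_energy = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                  for row in bank]
    # DCT-II decorrelates the log filterbank energies.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energy))
            for c in range(n_coeffs)]

# Example input: one 256-sample frame of a 1 kHz tone sampled at 16 kHz.
sample_rate = 16000
frame = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print([round(c, 3) for c in coeffs])
```

In a KWS pipeline of this kind, the coefficient vectors from successive frames would be stacked into a two-dimensional feature map that serves as the input to the CNN.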