Abstract

This paper proposes an energy-efficient reconfigurable accelerator for keyword spotting (EERA-KWS) based on a binary weight network (BWN) and fabricated in 28-nm CMOS technology. The keyword spotting system consists of two parts: feature extraction based on mel-scale frequency cepstral coefficients (MFCC) and keyword classification based on a BWN model, which is trained on Google's Speech Commands database and deployed on our custom. To reduce power consumption while maintaining recognition accuracy, we first optimize the MFCC implementation with approximate computing techniques, including pre-emphasis coefficient transformation, rectangular Mel filtering, framing, and FFT optimization. Then, we propose a precision self-adaptive reconfigurable accelerator with digital-analog mixed approximate computing units to process the BWN efficiently. Based on SNR prediction of the background noise and post-detection of the network output confidence, the BWN accelerator data path can be dynamically and adaptively reconfigured to 4, 8, or 16 bits. For the BWN accelerator, we propose a time-delay-based addition unit to perform bit-wise approximate computing for the convolution and fully connected layers, and a LUT-based unit for the activation layers. Implemented in the TSMC 28 nm HPC+ process technology, the estimated power is 77.8 μW to 115.9 μW, and the energy efficiency reaches 163 TOPS/W, which is over 1.8× better than the state-of-the-art architecture.
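The precision self-adaptive control described above can be illustrated with a minimal Python sketch. It is not the paper's control logic: the SNR thresholds, the confidence threshold, the direction of the mapping (cleaner audio gets fewer bits), and the names `select_bit_width`, `classify_adaptive`, and `run_network` are all assumptions introduced here for illustration. Only the three bit widths (4, 8, 16) and the two inputs (predicted SNR and output confidence) come from the text.

```python
def select_bit_width(snr_db, high_snr_db=20.0, low_snr_db=5.0):
    """Pick an initial data-path width from the predicted SNR.

    Thresholds are hypothetical; the assumption is that cleaner audio
    tolerates a narrower (cheaper) data path.
    """
    if snr_db >= high_snr_db:
        return 4
    if snr_db >= low_snr_db:
        return 8
    return 16


def classify_adaptive(run_network, snr_db, min_confidence=0.9):
    """Run the network, widening the data path while confidence is too low.

    run_network(bits) is a hypothetical callable returning (label, confidence).
    """
    next_width = {4: 8, 8: 16}
    bits = select_bit_width(snr_db)
    while True:
        label, confidence = run_network(bits)
        # Stop when the output is confident enough or no wider mode exists.
        if confidence >= min_confidence or bits == 16:
            return label, bits
        bits = next_width[bits]
```

In this sketch the SNR prediction acts before inference and the confidence check acts after it, which mirrors the two signals named in the abstract; how the actual hardware combines them is not stated here.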

Highlights

  • The keyword spotting (KWS) system automatically detects a set of particular keywords in a continuous stream of speech and is used in many human-computer interaction applications, such as wearable devices and the Internet of Things (IoT).

  • We propose an energy-efficient reconfigurable accelerator for keyword spotting (EERA-KWS) based on a binary weight network (BWN) using precision self-adaptive approximate computing.

  • The keyword spotting system consists of two parts: feature extraction based on mel-scale frequency cepstral coefficients (MFCC) and keyword classification based on a BWN model, which is trained on Google's Speech Commands database and deployed on our custom


Summary

INTRODUCTION

The keyword spotting (KWS) system automatically detects a set of particular keywords in a continuous stream of speech and is used in many human-computer interaction applications, such as wearable devices and the Internet of Things (IoT). Unlike the BNN adopted in [7], where the addition operations are 1-incremental additions, the BWN adopted for KWS in this work uses X-incremental additions (0 ≤ X ≤ 2^N, where N is the bit width of the input data of each layer). To accelerate this BWN and make it energy efficient, we propose a time-delay-based addition unit architecture and its circuit implementations.
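The difference between 1-incremental and X-incremental additions can be seen in a small software sketch. This is an illustration only, assuming weights in {+1, -1} and a match-counting formulation of the BNN; it models the arithmetic, not the time-delay-based circuit, and the function names are hypothetical.

```python
def bnn_accumulate(activations, weights):
    """BNN dot product: activations and weights are both in {+1, -1}.

    Each matching pair bumps a counter by exactly 1 (1-incremental).
    """
    matches = 0
    for a, w in zip(activations, weights):
        if a == w:
            matches += 1
    # Convert the match count back to the signed dot product.
    return 2 * matches - len(weights)


def bwn_accumulate(activations, weights, n_bits):
    """BWN dot product: weights in {+1, -1}, multi-bit activations.

    Each step adds or subtracts the activation X itself (X-incremental),
    with 0 <= X <= 2**n_bits as the bound is stated in the text.
    """
    limit = 1 << n_bits
    acc = 0
    for x, w in zip(activations, weights):
        if not 0 <= x <= limit:
            raise ValueError("activation outside the stated range")
        # The binary weight only selects the sign, so no multiplier is needed.
        acc += x if w > 0 else -x
    return acc
```

For example, `bwn_accumulate([3, 5, 7], [1, -1, 1], n_bits=4)` adds 3, subtracts 5, and adds 7. The step size varies with the data, which is why a plain counter no longer suffices and a dedicated addition unit is needed.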

OVERALL SYSTEM ARCHITECTURE
IMPLEMENTATION OF PRECISION CONTROL MODULE
IMPLEMENTATION OF ACTIVATION LAYER WITH
Findings
CONCLUSIONS