Abstract

Automatic speech recognition (ASR) has become a core technology for mobile devices. Delivering real-time, accurate ASR carries a large computational cost, which is challenging to sustain on tightly energy-constrained platforms such as mobile devices. A state-of-the-art ASR pipeline consists of a deep neural network (DNN) that converts the audio signal into phoneme probabilities, followed by a Viterbi search that uses these probabilities to generate a sequence of words. In this article, the authors propose an ASR system for low-power devices that combines a mobile GPU for the DNN with a dedicated hardware accelerator for the Viterbi search. DNN evaluation is easy to parallelize and therefore achieves high energy efficiency on a mobile GPU. The Viterbi search, by contrast, is difficult to parallelize and represents the main bottleneck for ASR, so the authors propose a hardware accelerator that dramatically reduces its energy requirements while increasing performance. Their proposal outperforms traditional solutions running on the CPU by orders of magnitude. Compared to a GPU-only system, their hybrid scheme combining the GPU and the accelerator improves performance by 5.25 times while reducing energy by 2.05 times.
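To make the pipeline's second stage concrete, the following Python sketch shows the basic Viterbi dynamic-programming recurrence over per-frame phoneme probabilities. It is a minimal illustration, not the authors' accelerator design: production ASR decoders search a much larger weighted finite-state graph and emit word sequences, and all names and toy values here are assumptions for demonstration.

```python
import numpy as np

def viterbi_decode(log_probs, log_trans, log_init):
    """Most likely state (phoneme) sequence for per-frame log-probabilities.

    log_probs : (T, S) per-frame log-likelihood of each of S states (DNN output)
    log_trans : (S, S) log transition probabilities between states
    log_init  : (S,)   log initial-state probabilities
    """
    T, S = log_probs.shape
    # Best accumulated score ending in each state, plus back-pointers.
    score = log_init + log_probs[0]
    back = np.zeros((T, S), dtype=int)

    for t in range(1, T):
        # cand[i, j]: score of reaching state j at time t from state i at t-1.
        cand = score[:, None] + log_trans          # shape (S, S)
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(S)] + log_probs[t]

    # Trace back the best path from the final best state.
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path, float(score.max())


# Toy example: 3 "phoneme" states over 5 frames (values are illustrative only).
rng = np.random.default_rng(0)
frames = np.log(rng.dirichlet(np.ones(3), size=5))   # stand-in for DNN outputs
trans = np.log(np.full((3, 3), 1 / 3))               # uniform transitions
init = np.log(np.full(3, 1 / 3))
print(viterbi_decode(frames, trans, init))
```

The sequential dependence between frames in the loop above is what makes the Viterbi search hard to parallelize on a GPU, and it motivates offloading this stage to a dedicated accelerator while the highly parallel DNN evaluation remains on the GPU.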
