Abstract

Voice User Interfaces (VUIs) have become popular thanks to their ease of use, which makes them accessible to the elderly and to people with disabilities. Nevertheless, their use in embedded systems for portable devices is limited by the computational complexity, memory requirements, and power consumption of keyword spotting (KWS) algorithms, which are usually based on deep neural networks. In this paper we propose a new algorithm based on convolutional neural networks for the keyword spotting task that offers a good trade-off among accuracy, power consumption, and memory footprint. To arrive at the proposed solution, we compared different neural network architectures and chose the one with the best trade-off among these metrics. To further improve performance, we implemented our solution on a dedicated hardware platform, the Myriad 2 by Movidius. Using this chip reduced inference time and energy per inference by 50%.

