Abstract

Voice assistants such as Siri, Google Assistant, and Alexa are widely used across the globe for home automation. They rely on unique phrases, known as hotwords, such as “Hey Alexa!”, “Ok, Google!”, and “Hey, Siri!”, to wake them up and perform an action. Hotword detectors are lightweight real-time engines whose purpose is to detect the hotwords uttered by the user. However, existing engines either require thousands of training samples or are closed source and charge a fee. This paper attempts to address these limitations by presenting the design and implementation of a lightweight, easy-to-implement hotword detection engine based on few-shot learning. The engine detects the hotword uttered by the user in real time from just a few training samples of the hotword. This approach is more efficient than existing implementations, in which adding a new hotword requires enormous amounts of positive and negative training samples and the model must be retrained for every hotword, making them inefficient in terms of computation and cost. The architecture proposed in this paper achieves an accuracy of 95.40%.
