Abstract

This paper proposes a customized wake-up word system combined with keyword spotting based on a neural network. The system operates in three phases: wake-up word training, wake-up word detection, and keyword spotting. In the training phase, the user can say any word in any language, and the system automatically counts the syllables in that word. If the syllable count falls within the accepted range, the system accepts the word as the customized wake-up word. Features are then extracted from the word using Mel-Frequency Cepstral Coefficients (MFCC) and used to build the speaker model, the speech model, and the state sequence for the next phase. In the detection phase, the system detects an unknown voice segment, compares it with these models, and decides whether to wake up. If the user says the correct wake-up word, the system proceeds to the keyword spotting phase, in which the command words are fixed and the keyword spotting model is a convolutional neural network. All processing runs without an Internet connection to protect user privacy. The system gives good results with a very small amount of wake-up word training data and runs in real time.
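
The syllable check in the training phase can be illustrated with a minimal sketch. The abstract does not say how syllables are counted or what the accepted range is, so everything below is an assumption made for illustration: syllables are approximated as contiguous runs of high short-time energy, and the function names, the threshold, and the bounds `min_syllables` and `max_syllables` are hypothetical.

```python
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame of frame_len samples."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def count_syllables(energies: Sequence[float], threshold: float) -> int:
    """Count contiguous runs of frames whose energy exceeds the threshold.

    Assumption: each run stands in for one syllable nucleus. This is a crude
    proxy, not the method used in the paper.
    """
    count = 0
    in_run = False
    for energy in energies:
        above = energy > threshold
        if above and not in_run:
            count += 1  # a new high-energy run starts here
        in_run = above
    return count


def accept_wake_word(
    energies: Sequence[float],
    threshold: float = 0.01,   # hypothetical energy threshold
    min_syllables: int = 3,    # hypothetical lower bound
    max_syllables: int = 6,    # hypothetical upper bound
) -> bool:
    """Accept the candidate word only if its syllable count is within range."""
    n = count_syllables(energies, threshold)
    return min_syllables <= n <= max_syllables
```

With a toy energy envelope such as `[0.0, 0.5, 0.6, 0.0, 0.0, 0.4, 0.0, 0.3, 0.3, 0.0]`, the counter finds three runs, so `accept_wake_word` accepts the word under the assumed bounds, while a single-burst envelope is rejected.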

