Abstract

In this paper, a deep neural network (DNN) based integrated background noise suppression and acoustic modeling for speech recognition proposed in which on/off state of the motor for the head shaking robot is employed as the relevant auxiliary information of the DNN input. Since the motor sound being generated when the robot is moving or shaking its head severely degrades the performance of the speech recognition accuracy, we propose to use the motor on/off state as additional information when designing the DNN-based recognition system. Our speech recognition algorithm consists of two parts including the feature mapping model for feature enhancement and the acoustic model for phoneme recognition. As for the feature mapping, the stacked DNN is designed for the precise feature enhancement such that the lower DNN and upper DNN are trained separately and combined after which the motor state is plugged into both the lower DNN and upper DNN in addition to the input noisy speech. Then, the acoustic model is trained upon the feature enhancement model in which the motor state is again used as the augmented feature. The proposed technique to suppress the acoustic and motor noises was evaluated in term of the phoneme error rate (PER) and showed a significant improvement over the conventional system.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.