Abstract

Deep learning has achieved great success in speech recognition. The spread of embedded devices such as smart speakers, together with growing demand for dialect interaction, poses major challenges for far-field speech recognition and dialect recognition. To address dialect recognition on embedded devices under far-field conditions, we propose a deep neural network model with multitask learning. First, the audio is passed through an end-to-end noise reduction model to improve recognition performance. We then define dialect recognition as the main task and dialect area recognition as the auxiliary task, and use multitask learning to improve the accuracy of dialect classification. Experimental results show that the end-to-end noise reduction model improves recognition accuracy, by up to 7.54% over the baseline, and that the multitask learning model improves dialect recognition accuracy by about 5%.
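The main-task/auxiliary-task setup described above can be illustrated with a minimal sketch of a combined training loss. This is not the paper's implementation: the abstract does not state the loss function, so the weighted sum of two cross-entropy terms, the function names, and the `aux_weight` value are all assumptions chosen for illustration. Both sets of logits are assumed to come from two output heads on a shared network.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the correct class index."""
    return -math.log(softmax(logits)[target])


def multitask_loss(dialect_logits, area_logits, dialect_label, area_label,
                   aux_weight=0.3):
    """Hypothetical combined loss: main task plus weighted auxiliary task.

    dialect_logits: scores from the main head (one per dialect)
    area_logits:    scores from the auxiliary head (one per dialect area)
    aux_weight:     assumed hyperparameter, not taken from the paper
    """
    main = cross_entropy(dialect_logits, dialect_label)
    aux = cross_entropy(area_logits, area_label)
    return main + aux_weight * aux


# Example: 4 dialects grouped into 2 dialect areas (made-up scores).
loss = multitask_loss([2.0, 0.5, 0.1, -1.0], [1.5, -0.5],
                      dialect_label=0, area_label=0)
print(f"combined loss: {loss:.4f}")
```

Setting `aux_weight` to 0 reduces this to single-task dialect classification, which makes it a convenient knob for comparing against a single-task baseline.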
