Abstract

Speech recognition is an important research field in natural language processing. In Chinese and English, which have rich data resources, the performance of end-to-end speech recognition model is close to that of Hidden Markov Model—Deep Neural Network (HMM-DNN) model. However, for the low resource speech recognition task of Chinese English hybrid, the end-to-end speech recognition system does not achieve good performance. In the case of limited mixed data between Chinese and English, the modeling method of end-to-end speech recognition is studied. This paper focuses on two end-to-end speech recognition models: connection timing distribution and attention based codec network. In order to improve the performance of Chinese English hybrid speech recognition, this paper studies how to improve the performance of the coder based on connection timing distribution model and attention mechanism, and tries to combine the two models to improve the performance of Chinese English hybrid speech recognition. In low resource Chinese English mixed data, the advantages of different models are used to improve the performance of end-to-end models, so as to improve the recognition accuracy of speech recognition technology in legal Chinese English simultaneous interpretation.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call