Yi Language Speech Recognition using Deep Learning Methods

Ziyan Chen, Hongwu Yang

doi:10.1109/itnec48623.2020.9084771

Abstract

Yi is one of the most representative languages in the Yi branch of the Tibeto-Burman language family. To strengthen the protection of Yi, an endangered minority language, this paper studies continuous speech recognition of Yi with several deep learning methods and compares them to find the best recognition performance. In the training stage, we first analyzed Yi text to build the language model, and then trained four acoustic models on the audio data: hidden Markov model (HMM), deep neural network (DNN), time-delay neural network (TDNN) and end-to-end. In the recognition stage, the language model is matched to the acoustic model, and the recognized Yi text is obtained by jointly decoding with the dictionary and the acoustic feature vectors. Comparing the word error rates of the acoustic models, the time-delay neural network performs best on the existing Yi corpus, with a word error rate of only 16.65%.
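For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions and insertions) between the reference transcript and the recognized text, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring tool used in the paper. The function name `word_error_rate` is hypothetical, and splitting on whitespace is an assumption; how the Yi corpus is actually tokenized is not stated in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are separated by whitespace. A real Yi corpus
    may need a different tokenization (for example, per syllable).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one token")

    # prev[j] holds the edit distance between the reference tokens
    # processed so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_tok in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_tok == hyp_tok else 1)
            deletion = prev[j] + 1       # reference token missing from hypothesis
            insertion = curr[j - 1] + 1  # extra token in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder tokens: one substitution and one deletion
    # against a four-token reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.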

Full Text