The Research of Chain Model Based on CNN-TDNNF in Yulin Dialect Speech Recognition

Mengen Zhai, Feifan Yu, Yi Qin, Lihong Dong

doi:10.1109/icivc55077.2022.9886397

Abstract

As a local dialect in China, the Yulin dialect has received little attention in speech recognition research, and the available corpus data is small. In this paper, we collect local dialect speech and text, build a speech database, analyze the dialect's pronunciation characteristics, and build a pronunciation dictionary that uses the dialect's vowels and rhymes as the base units. This addresses the low recognition performance of speech recognition systems on dialects, accents, and low-resource corpora. First, we use speed perturbation as a data augmentation scheme to increase the information contained in the input features during feature extraction. Second, CNN-TDNNF, a neural network capable of training on long time series, is used as the acoustic model in combination with an N-gram language model. Finally, the experimental results show that this scheme improves performance by 15.42% over a traditional dialect speech recognition system in a dialect setting with a low-resource corpus.
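The speed-based augmentation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `speed_perturb` and `augment`, the factors 0.9, 1.0, and 1.1, and the use of plain linear interpolation are all assumptions made for illustration. Production toolkits typically use band-limited resampling instead.

```python
import numpy as np

def speed_perturb(signal, factor):
    """Return a copy of a 1-D waveform that plays `factor` times faster.

    Hypothetical helper: resamples by linear interpolation, which changes
    both duration and pitch, as speed perturbation does.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # A faster signal (factor > 1) has fewer samples at the same sample rate.
    n_out = int(round(len(signal) / factor))
    # Positions in the original signal that each output sample reads from.
    src_positions = np.linspace(0.0, len(signal) - 1, num=n_out)
    return np.interp(src_positions, np.arange(len(signal)), signal)

def augment(signal, factors=(0.9, 1.0, 1.1)):
    """Build one perturbed copy per factor (the factor values are assumed)."""
    return {f: speed_perturb(signal, f) for f in factors}

# One second of a 440 Hz tone at an assumed 16 kHz sample rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)
copies = augment(tone)
lengths = {f: len(x) for f, x in copies.items()}
```

With three factors, each utterance yields three training copies of different durations, which is how this kind of augmentation enlarges a low-resource corpus without new recordings.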

Full Text