Abstract

Traditional frequency warping algorithms in speech recognition mainly consider the spectral shift caused by differences in vocal tract length. However, pronunciation variability is also affected by differences in the glottis, and these two sources of variation are interrelated. Although piecewise linear frequency warping considers the effects of the glottis and the vocal tract, it loses some information because it is not continuous and considers the effect of only a single factor. This paper proposes a two-factor frequency warping algorithm based on the weighted fusion of a glottal resonance factor and a third-formant factor. Compared with piecewise linear frequency warping, the spectral alignment in this algorithm is smoother and takes the relationship between the glottis and the vocal tract into account. Experimental results show that the proposed algorithm improves the recognition rate under both matched and mismatched training and test conditions.
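
The abstract does not give the fusion formula or the warping function, so the following is only a minimal sketch of what a weighted fusion of two warping factors could look like. It assumes that each factor is a scalar per-speaker warping ratio, that the fusion is a convex combination with a single weight, and that the fused factor scales the frequency axis linearly. The names `fused_warp_factor`, `warp_frequencies`, `alpha_glottal`, `alpha_f3`, and `weight` are hypothetical and do not come from the paper.

```python
import numpy as np


def fused_warp_factor(alpha_glottal, alpha_f3, weight=0.5):
    """Combine two per-speaker warping factors into one.

    Assumption: a convex combination with a single weight in [0, 1].
    The paper's actual fusion rule is not stated in the abstract.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * alpha_glottal + (1.0 - weight) * alpha_f3


def warp_frequencies(freqs_hz, alpha):
    """Scale the frequency axis by alpha.

    Assumption: a plain linear scaling, chosen because it is continuous
    and has no breakpoint, unlike a piecewise linear warp.
    """
    return alpha * np.asarray(freqs_hz, dtype=float)


if __name__ == "__main__":
    # Illustrative values only; these are not results from the paper.
    alpha = fused_warp_factor(alpha_glottal=1.1, alpha_f3=0.9, weight=0.5)
    print(warp_frequencies([500.0, 1500.0, 2500.0], alpha))
```

With a weight of 1 the sketch falls back to the glottal factor alone, and with a weight of 0 to the third-formant factor alone, so intermediate weights express how strongly each factor drives the warp.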
