Abstract

The nonlinear expansion and contraction of syllable pronunciations, together with their sequential time-varying features, greatly complicate automatic speech recognition. Each syllable is represented by a sequence of vectors of linear predictive coding cepstra (LPCC). Even when the same speaker utters the same syllable, the duration of the stable parts of the LPCC vector sequence changes every time. Therefore, the stable parts are contracted so that every compressed speech waveform has the same length. We propose five simple techniques to contract the stable parts of the sequence of LPCC vectors. A simplified Bayes decision rule with a weighted variance is used to classify 408 speaker-dependent Mandarin syllables. On these syllables, the recognition rate is 94.36%, compared with 79.78% obtained using hidden Markov models (HMM). A recognition rate of 98.16% is achieved within the top 3 candidates. The features proposed in this paper to represent the syllables are simple and easy to extract. The computation for feature extraction and classification is much faster than with the HMM or any other known technique.
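The pipeline described above (contract each variable-length sequence of feature vectors to a fixed length, then classify with a simplified Bayes rule) can be illustrated with a minimal sketch. The abstract does not spell out the five contraction techniques or the variance weighting, so everything here is an assumption for illustration: `contract_sequence` averages equal contiguous segments, and the classifier is a plain diagonal-Gaussian Bayes rule with equal priors and no weighting. All function names are hypothetical.

```python
import math


def contract_sequence(frames, target_len):
    """Contract a list of feature vectors to exactly target_len vectors.

    Assumption: frames are split into target_len nearly equal contiguous
    segments and each segment is averaged. This is one plausible scheme,
    not necessarily any of the five techniques proposed in the paper.
    """
    n = len(frames)
    if n < target_len:
        raise ValueError("sequence is shorter than the target length")
    dim = len(frames[0])
    contracted = []
    for k in range(target_len):
        start = k * n // target_len
        end = (k + 1) * n // target_len
        segment = frames[start:end]
        contracted.append(
            [sum(f[d] for f in segment) / len(segment) for d in range(dim)]
        )
    return contracted


def flatten(vectors):
    """Concatenate the contracted vectors into one fixed-length feature vector."""
    return [x for v in vectors for x in v]


def fit_class(samples):
    """Estimate per-feature means and variances for one syllable class."""
    n = len(samples)
    dim = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dim)]
    # Floor the variance so the log and the division below stay finite.
    variances = [
        max(sum((s[d] - means[d]) ** 2 for s in samples) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances


def bayes_distance(x, means, variances):
    """Negative log-likelihood (up to a constant) under a diagonal Gaussian."""
    return sum(
        math.log(v) + (xi - m) ** 2 / v for xi, m, v in zip(x, means, variances)
    )


def rank_candidates(x, models):
    """Return class labels ordered from most to least likely."""
    return sorted(models, key=lambda label: bayes_distance(x, *models[label]))


# Toy usage with made-up 1-dimensional "features" (not real LPCC data).
features = flatten(contract_sequence([[0.0], [2.0], [4.0], [6.0]], 2))
models = {
    "a": fit_class([[0.0, 0.0], [0.2, 0.2]]),
    "b": fit_class([[5.0, 5.0], [5.2, 5.2]]),
}
best = rank_candidates([0.1, 0.1], models)[0]
```

Taking the first three entries of `rank_candidates` corresponds to the "top 3 candidates" evaluation mentioned above. Because every syllable ends up as a vector of the same length, training and classification reduce to per-feature means, variances, and a single sum, which is consistent with the claim that the approach is computationally light.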
