Abstract

Because highly similar utterances, such as CV syllables that share the same vowel, are difficult to recognize, a two-step approach was proposed. In the first step, the normalized log energy, 32 spectral band log energies, and the spectral change were used in a DP (dynamic programming) algorithm to determine (a) which of five broad acoustic classes of the consonant (voiced stop, unvoiced stop, nasal, liquid, or fricative) the best-matching syllable fell into and (b) the warping functions. In the second step, the test pattern was compared with the reference patterns of the class recognized in the first step, emphasizing the regions that differentiate them, to reach the final recognition of the syllable. The patterns were matched along the warping function, taking into account only the frames around the transitional region. The final distances were calculated using only the spectral bands that capture the distinctive features of the acoustic class. Speaker-dependent performance on the ten most frequent Spanish CV syllables improved from 78% to 99% when the two-step procedure was used instead of taking the first step alone as the final recognition of the syllable.
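The two-step procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean frame distance, the symmetric DP step pattern, the averaging of second-step distances, and the names `frame_dist`, `dtw`, `two_step`, `class_bands`, `transition`, and `margin` are all assumptions made for illustration, and the three-value frames are toy data standing in for the energy, 32 band energies, and spectral change used in the paper.

```python
import math

def frame_dist(a, b, bands=None):
    """Euclidean distance between two frames, optionally over selected bands only."""
    idx = range(len(a)) if bands is None else bands
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in idx))

def dtw(test, ref):
    """Basic DP alignment (assumed step pattern); returns (distance, warping path)."""
    n, m = len(test), len(ref)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(test[i - 1], ref[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end to recover the warping function as (test, ref) pairs.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

def two_step(test, refs, class_bands, margin=1):
    """Return (first-step label, broad class, second-step label)."""
    # Step 1: full-pattern DP against every reference; keep the warping paths.
    aligned = [(dtw(test, r["frames"]), r) for r in refs]
    first = min(aligned, key=lambda item: item[0][0])[1]
    cls = first["cls"]
    # Step 2: re-score only references of that class, along the same warping
    # path, using frames near the transition and the class-specific bands.
    best_label, best_score = None, float("inf")
    for (_, path), r in aligned:
        if r["cls"] != cls:
            continue
        pairs = [(i, j) for i, j in path if abs(j - r["transition"]) <= margin]
        score = sum(frame_dist(test[i], r["frames"][j], class_bands[cls])
                    for i, j in pairs) / len(pairs)
        if score < best_score:
            best_label, best_score = r["label"], score
    return first["label"], cls, best_label

# Toy data (hypothetical): bands 0 and 1 carry the consonant cue, band 2 the vowel.
refs = [
    {"label": "pa", "cls": "unvoiced_stop", "transition": 1,
     "frames": [[1, 0, 0], [1, 0, 0], [0, 0, 4], [0, 0, 4]]},
    {"label": "ta", "cls": "unvoiced_stop", "transition": 1,
     "frames": [[0, 1, 0], [0, 1, 0], [0, 0, 6], [0, 0, 6]]},
    {"label": "ma", "cls": "nasal", "transition": 1,
     "frames": [[3, 3, 0], [3, 3, 0], [0, 0, 4], [0, 0, 4]]},
]
class_bands = {"unvoiced_stop": [0, 1], "nasal": [0, 1]}
test_pattern = [[0.1, 0.9, 0], [0.1, 0.9, 0], [0, 0, 4.2], [0, 0, 4.2]]

result = two_step(test_pattern, refs, class_bands)
print(result)
```

In this toy case the vowel frames dominate the full-pattern distance, so the first step picks "pa" (right class, wrong syllable); restricting the comparison to the transitional frames and the consonant bands then selects "ta".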
