Abstract
This paper describes experiments on the recognition of stop consonants in continuous speech using local spectral peaks. The spectrum obtained through a band-pass filter bank is sampled every 10 ms. Each sampled spectrum is represented by a binary-valued vector in which each element denotes the presence or absence of a local peak. The frequency distribution of local spectral peaks in the 40 ms from the burst frame is transformed into a feature vector, and the conditional probability of the feature vector is used for recognition. The experiments were carried out using 212 Japanese words uttered by 10 male and 10 female speakers. The stop consonants are discriminated at a comparatively high rate using only the local spectral peaks. To improve the recognition rates, the differences between successive spectra are also used. In leave-one-out experiments, the recognition rates for the unvoiced stops are 82.6% for the 10 male speakers and 79.1% for the 10 female speakers; those for the voiced stops are 74.5% and 66.1%, respectively. These results indicate that the local spectral peaks and the temporal changes in the spectra are significant features for discriminating stop consonants.
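The peak representation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `local_peaks` and `peak_counts`, the five-channel toy spectra, the treatment of edge channels, and the choice to count the burst frame as the first of four 10 ms frames are all assumptions made for illustration. The abstract does not say how the peak distribution is transformed into the feature vector, so the sketch stops at the per-channel peak counts.

```python
def local_peaks(spectrum):
    """Return a 0/1 list marking channels that are strict local maxima.

    Assumption: an edge channel counts as a peak if it exceeds its
    single neighbor.
    """
    n = len(spectrum)
    peaks = [0] * n
    for i in range(n):
        left = spectrum[i - 1] if i > 0 else float("-inf")
        right = spectrum[i + 1] if i < n - 1 else float("-inf")
        if spectrum[i] > left and spectrum[i] > right:
            peaks[i] = 1
    return peaks


def peak_counts(frames, burst_index, n_frames=4):
    """Count, per channel, how often a local peak occurs in n_frames
    frames starting at the burst frame (4 frames x 10 ms = 40 ms).

    Assumption: the window includes the burst frame itself.
    """
    window = frames[burst_index:burst_index + n_frames]
    counts = [0] * len(frames[0])
    for frame in window:
        for channel, bit in enumerate(local_peaks(frame)):
            counts[channel] += bit
    return counts


# Hypothetical filter-bank outputs: one row per 10 ms frame,
# one column per band-pass channel (toy values, not real data).
frames = [
    [1, 3, 2, 5, 4],
    [2, 1, 4, 3, 6],
    [1, 4, 2, 6, 3],
    [1, 2, 5, 1, 0],
]

print(local_peaks(frames[0]))   # [0, 1, 0, 1, 0]
print(peak_counts(frames, 0))   # [1, 2, 2, 2, 1]
```

Counting peaks per channel discards the spectral amplitudes, which matches the abstract's claim that the stops can be discriminated from peak positions alone. A real filter bank would have more channels than this toy example.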