Recognition of stop consonants in Japanese words using local spectral peaks.

Kyung Tae Kim, Ken'Iti Kido, Shozo Makino

doi:10.1250/ast.7.325

Open Access

https://doi.org/10.1250/ast.7.325

Abstract

This paper describes experiments on the recognition of stop consonants in continuous speech using local spectral peaks. The spectrum obtained through a band-pass filter bank is sampled every 10 ms. Each sampled spectrum is represented by a binary-valued vector in which every element denotes the presence or absence of a local peak. The frequency distribution of local spectral peaks in the 40 ms following the burst frame is transformed into a feature vector, and the conditional probability of this feature vector is used for recognition. The experiments were carried out using 212 Japanese words uttered by 10 male and 10 female speakers. The stop consonants are discriminated at a comparatively high rate using only the local spectral peaks. To improve the recognition rates, the differences between successive spectra are also used. In leave-one-out experiments, the recognition rates for the unvoiced stops are 82.6% for the 10 male speakers and 79.1% for the 10 female speakers; those for the voiced stops are 74.5% and 66.1%, respectively. These results indicate that the local spectral peaks and the temporal changes in the spectra are highly significant features for discriminating stop consonants.
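
The binary peak representation and the peak distribution over the 40 ms window can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the strict-neighbour peak criterion, the treatment of edge channels as non-peaks, and the simple per-channel count used as the "frequency distribution" are all assumptions made for illustration. Only the 10 ms sampling interval and the 40 ms window come from the abstract.

```python
from typing import List, Sequence

FRAME_STEP_MS = 10  # sampling interval stated in the abstract
WINDOW_MS = 40      # window starting at the burst frame, stated in the abstract


def local_peaks(spectrum: Sequence[float]) -> List[int]:
    """Binary vector: 1 where a channel is strictly above both neighbours.

    Assumption: edge channels are never marked as peaks.
    """
    peaks = [0] * len(spectrum)
    for i in range(1, len(spectrum) - 1):
        if spectrum[i] > spectrum[i - 1] and spectrum[i] > spectrum[i + 1]:
            peaks[i] = 1
    return peaks


def peak_distribution(frames: Sequence[Sequence[float]],
                      burst_index: int) -> List[int]:
    """Count, per channel, how many frames in the window contain a peak.

    Assumption: the window is the burst frame plus the following frames,
    WINDOW_MS // FRAME_STEP_MS frames in total.
    """
    n_frames = WINDOW_MS // FRAME_STEP_MS
    window = frames[burst_index:burst_index + n_frames]
    counts = [0] * len(window[0])
    for frame in window:
        for channel, flag in enumerate(local_peaks(frame)):
            counts[channel] += flag
    return counts


# Toy filter-bank outputs: five channels, five frames (hypothetical values).
frames = [
    [1, 3, 2, 5, 4],
    [1, 2, 3, 2, 1],
    [0, 4, 1, 1, 0],
    [2, 1, 3, 1, 2],
    [9, 9, 9, 9, 9],
]
print(local_peaks(frames[0]))
print(peak_distribution(frames, burst_index=0))
```

In this sketch the per-channel counts stand in for the feature vector; the abstract does not specify how the distribution is transformed or how the conditional probabilities are estimated, so those steps are left out.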
