Abstract

A new method of representing phonemic categories and determining their standard values from a training sample distribution is presented. It is an essential part of a phoneme recognition system aimed at speaker-independent speech recognition. The phonemic value of a short-duration speech signal of up to 50 ms is represented by a matrix composed of acoustic parameters. Standard phonemic categories (SPCs) are defined by a combination of several simple potential functions in this matrix space. The set of potential functions, as well as their number, is determined automatically by the proposed method. Processing consists primarily of algebraic operations and is formulated by analogy to particle dynamics. The method is applied to voiceless and voiced stop consonant sets spoken by twelve speakers. The relationship between the classification rate and the number of SPCs is investigated under several initial conditions. Stop consonant recognition tests on CV syllables are made using the derived SPC sets, irrespective of the following vowel. Recognition rates for the utterances of four speakers not included among the twelve speakers used for training were 84 percent for voiceless stops and 81 percent for voiced stops.
