Abstract

A method and results are presented for applying statistical pattern-recognition techniques to the automatic classification of words and talkers, using formant-tracking data as input. The method expresses an entire formant pattern as a single n-dimensional vector, in which each component is a sample of one of the formant functions at a given time. The dimensionality n is therefore the product of the number of formant parameters sampled and the number of observation epochs. A computer program processes these data in two steps. The first step, called the “learning” program, establishes separate statistical categories by computing the mean vector and covariance matrix that describe a normal distribution function for the set of vectors defining each category. The second step, called the “recognition” program, recognizes an unknown by computing its statistical distance to every possible category and selecting the category for which this distance is minimum. Results are presented as confusion matrices for word recognition of the first ten letters of the alphabet as spoken by 10 talkers, and for recognition of the talkers themselves.
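The two-step procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used in the work: the abstract does not define its "statistical distance", so the squared Mahalanobis distance is assumed here, and the small ridge term added to the covariance diagonal is likewise an assumption made to keep the matrix invertible. All function names and the training values are hypothetical; the numbers are synthetic and do not come from the reported experiments.

```python
import random

def flatten(tracks):
    """Stack formant tracks (one list of samples per formant) into one vector.
    The dimensionality is (number of formants) x (number of epochs)."""
    return [value for track in tracks for value in track]

def mean_vector(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def covariance_matrix(vectors, mean, ridge=1e-3):
    """Sample covariance; the ridge term is an assumption, not from the source."""
    n, d = len(vectors), len(mean)
    cov = [[0.0] * d for _ in range(d)]
    for v in vectors:
        diff = [x - m for x, m in zip(v, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / (n - 1)
    for i in range(d):
        cov[i][i] += ridge
    return cov

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    d = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, d):
            factor = m[r][col] / m[col][col]
            for c in range(col, d + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * d
    for i in reversed(range(d)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, d))
        x[i] = (m[i][d] - tail) / m[i][i]
    return x

def learn(training):
    """'Learning' step: one (mean vector, covariance matrix) pair per category."""
    model = {}
    for label, vectors in training.items():
        mean = mean_vector(vectors)
        model[label] = (mean, covariance_matrix(vectors, mean))
    return model

def distance(x, mean, cov):
    """Squared Mahalanobis distance (assumed form of the statistical distance)."""
    diff = [a - b for a, b in zip(x, mean)]
    return sum(d_i * s_i for d_i, s_i in zip(diff, solve(cov, diff)))

def recognize(model, x):
    """'Recognition' step: pick the category at minimum distance."""
    return min(model, key=lambda label: distance(x, *model[label]))

# Synthetic demo: 2 formants x 3 epochs gives 6-dimensional vectors.
rng = random.Random(0)
CENTERS = {
    "A": flatten([[300, 320, 340], [2200, 2150, 2100]]),
    "B": flatten([[700, 680, 660], [1200, 1250, 1300]]),
}
training = {
    label: [[c + rng.gauss(0, 30) for c in center] for _ in range(20)]
    for label, center in CENTERS.items()
}
model = learn(training)
print(recognize(model, CENTERS["A"]))
```

Solving the linear system instead of explicitly inverting the covariance matrix is a design choice of this sketch. A classifier built on the full normal density would also add a log-determinant term to the distance, which is omitted here for brevity.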
