Text‐independent speaker verification using linear predictive hidden Markov models

Naftali Tishby

doi:10.1121/1.2024482

Abstract

The application of hidden Markov models to text‐independent speaker verification was studied. Linear predictive hidden Markov models have proven to be an efficient means of statistically modeling speech signals in all forms of speech recognition. As Alan Poritz has already suggested, such models can also be used to characterize the talker statistically. In our case, ergodic Markov models with four to seven states, in which the spectral density of each state is characterized by eighth‐order LPC coefficients, are shown to discriminate between speakers. A large database of 100 talkers with about 20 000 isolated digits was used. The states, together with the optimal model dimensionality, tend to supply enough information to verify the talker with an error rate below 3%, given sufficiently large testing data. The results show, however, that most of the speaker‐dependent information is contained in the spectral densities and very little is left in the transition matrix. This may justify the vector quantization approach to the problem.
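
As a rough illustration of how an ergodic hidden Markov model can score an utterance for verification, the sketch below runs the standard forward algorithm in the log domain. It is a minimal sketch under stated assumptions, not the procedure used in the study: the function names, the per-frame average score, and the threshold are hypothetical, and the per-state frame log-likelihoods (which in the abstract would come from the LPC spectral densities) are simply taken as input.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(log_init, log_trans, log_emit):
    """Return log P(frames | model) using the forward algorithm.

    log_init[j]     : log initial probability of state j
    log_trans[i][j] : log transition probability from state i to state j
                      (ergodic: every transition is allowed)
    log_emit[t][j]  : log-likelihood of frame t under state j
                      (assumed precomputed, e.g. from an LPC-based density)
    """
    n_states = len(log_init)
    alpha = [log_init[j] + log_emit[0][j] for j in range(n_states)]
    for frame in log_emit[1:]:
        alpha = [
            logsumexp([alpha[i] + log_trans[i][j] for i in range(n_states)])
            + frame[j]
            for j in range(n_states)
        ]
    return logsumexp(alpha)


def accept(log_likelihood, n_frames, threshold):
    """Hypothetical decision rule: accept the claimed identity when the
    average per-frame log-likelihood reaches the threshold."""
    return log_likelihood / n_frames >= threshold
```

When every row of the transition matrix is identical, the forward score reduces to a product of independent per-frame mixture likelihoods, so the state sequence carries no information beyond the state densities themselves. That is the limiting case relevant to the remark that little speaker-dependent information is left in the transition matrix.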
