Abstract

A new classification system for text-independent speaker recognition is presented. Text-independent speaker recognition systems generally model each speaker with a single classifier. Traditional methods use unsupervised training algorithms, such as vector quantization (VQ), to model each speaker. Such methods base their decision on the distortion between an observation and the speaker model. Recently, supervised training algorithms, such as neural networks, have been successfully applied to speaker recognition. Here, each speaker is represented by a neural network. Due to their discriminative training, neural networks capture the differences between speakers and use this criterion for decision making. Hence, the output of a neural network can be considered an interclass measure. The VQ classifier, on the other hand, uses a distortion measure that is independent of the other speaker models, and can be considered an intraclass measure. Since these two measures are based on different criteria, they can be effectively combined to yield improved performance. This paper uses data fusion concepts to combine the outputs of the neural tree network (NTN) and VQ classifiers. The combined system is evaluated for text-independent speaker identification and verification and is shown to outperform either classifier when used individually.
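The fusion idea can be illustrated with a minimal sketch. The Python snippet below computes a per-speaker VQ distortion (the intraclass measure) and combines it with a discriminative network score (the interclass measure) through a simple linear opinion pool. The fusion weight `alpha`, the codebook size, and the distortion-to-score mapping are illustrative assumptions, not the paper's exact fusion rule.

```python
import numpy as np

def vq_distortion(features, codebook):
    """Average distance from each feature vector to its nearest
    codebook centroid -- an intraclass (distortion) measure.
    features: (T, d) array of feature vectors (e.g., cepstra)
    codebook: (K, d) array of VQ centroids for one speaker
    """
    # Pairwise squared Euclidean distances, shape (T, K).
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d2.min(axis=1).mean()

def fuse_scores(ntn_score, vq_dist, alpha=0.5):
    """Hypothetical linear fusion of the two measures.
    ntn_score: discriminative (interclass) score in [0, 1];
               higher means a better match to the claimed speaker.
    vq_dist:   VQ distortion; lower means a better match, so it is
               mapped to a similarity before combining.
    alpha:     fusion weight (an assumed parameter, not from the paper).
    """
    vq_score = 1.0 / (1.0 + vq_dist)  # assumed distortion-to-score mapping
    return alpha * ntn_score + (1.0 - alpha) * vq_score

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(100, 12))  # 100 frames of 12-dim features (assumed)
    cb = rng.normal(size=(64, 12))      # 64-entry speaker codebook (assumed size)
    print(fuse_scores(ntn_score=0.8, vq_dist=vq_distortion(feats, cb)))
```

For identification, such a fused score would be computed against every enrolled speaker's codebook/network pair and the maximum taken; for verification, the fused score for the claimed speaker would be compared against a threshold.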
