Abstract

A novel Discriminative Vector Quantization method for Speaker Identification (DVQSI) is proposed, and its parameter selection is discussed. In the training mode of this approach, the vector space of speech features is divided into a number of regions, and a Vector Quantization (VQ) codebook is constructed for each speaker in each region. For every speaker pair, a discriminative weight is assigned to each region based on the region's ability to discriminate between the two speakers: a region in which the distributions of the two speakers' speech feature vector sets differ more receives a larger discriminative weight and therefore plays a more important role in identifying the better-matching speaker of the pair. In the testing mode, to identify an unknown speaker, discriminatively weighted average VQ distortion pairs are computed for the unknown speaker's input waveform, and a technique is described that determines the best match between the unknown waveform and the speakers' templates. The proposed DVQSI approach can be considered a generalization of the existing VQ technique for Speaker Identification (VQSI). By employing the discriminative weights and the space segmentation as design parameters, the method presented here yields better Speaker Identification (SI) accuracy, which is confirmed experimentally. In addition, a computationally efficient implementation of the DVQSI technique is given that uses a tree-structured-like approach to obtain the codebooks.
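
As a rough illustration of the flow summarized above, the following Python sketch trains per-speaker, per-region codebooks and pairwise discriminative weights, then identifies a test utterance by weighted pairwise comparison. The region partitioning (a coarse global k-means), the codebook sizes, and the particular weight formula are assumptions made for the sketch, not definitions taken from the paper.

```python
# Minimal sketch of a DVQSI-style flow, assuming MFCC-like feature vectors
# per speaker. Region partitioning, codebook sizes, and the weight formula
# are illustrative assumptions, not the paper's definitions.
import numpy as np
from itertools import combinations
from scipy.cluster.vq import kmeans2, vq

def train_dvqsi(train_feats, n_regions=4, codebook_size=16):
    """train_feats: dict speaker -> (N, D) array of training feature vectors."""
    all_feats = np.vstack(list(train_feats.values()))

    # Training step 1: divide the feature space into regions (coarse global VQ).
    region_centroids, _ = kmeans2(all_feats, n_regions, minit='++')

    # Training step 2: one VQ codebook per speaker per region.
    codebooks = {}
    for spk, feats in train_feats.items():
        region_ids, _ = vq(feats, region_centroids)
        for r in range(n_regions):
            sub = feats[region_ids == r]
            data = sub if len(sub) >= codebook_size else feats  # fallback for sparse regions
            codebooks[(spk, r)], _ = kmeans2(data, codebook_size, minit='++')

    def region_distortions(feats, spk):
        """Average VQ distortion of feats against speaker spk, per region."""
        region_ids, _ = vq(feats, region_centroids)
        d = np.zeros(n_regions)
        for r in range(n_regions):
            sub = feats[region_ids == r]
            if len(sub):
                _, dist = vq(sub, codebooks[(spk, r)])
                d[r] = dist.mean()
        return d

    # Training step 3: for every speaker pair, weight each region by how
    # differently it quantizes the two speakers' training data (normalized).
    weights = {}
    for a, b in combinations(train_feats, 2):
        sep = (np.abs(region_distortions(train_feats[a], b) -
                      region_distortions(train_feats[a], a)) +
               np.abs(region_distortions(train_feats[b], a) -
                      region_distortions(train_feats[b], b)))
        weights[frozenset((a, b))] = sep / max(sep.sum(), 1e-12)

    return region_distortions, weights

def identify(test_feats, speakers, region_distortions, weights):
    """Testing mode: weighted-distortion comparison for every speaker pair;
    the speaker winning the most pairwise contests is declared the identity."""
    wins = dict.fromkeys(speakers, 0)
    for a, b in combinations(speakers, 2):
        w = weights[frozenset((a, b))]
        d_a = w @ region_distortions(test_feats, a)
        d_b = w @ region_distortions(test_feats, b)
        wins[a if d_a < d_b else b] += 1
    return max(wins, key=wins.get)
```

A simple pairwise tournament is used here as the matching rule; the paper's actual matching technique and its tree-structured-like codebook construction are not reproduced in this sketch.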
