Abstract

ABSTRACT This paper aims to provide different approaches to text dependent speaker identification using various transformation techniques such as DCT, Walsh and Haar transform along with use of spectrograms. Set of spectrograms obtained from speech samples is used as image database for the study undertaken. This image database is then subjected to various transforms. Using Euclidean distance as measure of similarity, most appropriate speaker match is obtained which is declared to be identified speaker. Each transform is applied to spectrograms in two different ways: on full image and on Row Mean of an image. In both the ways, effect of different number of coefficients of transformed image is observed. Further, comparison of all three transformation techniques on spectrograms in both the ways shows that numbers of mathematical computations required for Walsh transform is much lesser than number of mathematical computations required in case of DCT on spectrograms. Whereas, use of Haar transform on spectrograms drastically reduces the number of mathematical computation with almost equal identification rate. Transformation techniques on Row Mean give better identification rate than transformation technique on full image.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call