Abstract

The aim of this paper is to extract and select features from the speech signal that make an acceptable speaker recognition rate achievable in real-life conditions. Various combinations of formants (F1, F2, F3), Linear Predictive Coefficients (LPC), Mel Frequency Cepstral Coefficients (MFCC), and delta Mel Frequency Cepstral Coefficients are considered as features, and their effect on speaker recognition is observed. Two data sets of similar volume but with different strings (words) are considered in the present study. The two data sets are prepared using two different sampling rates. The study also reveals that the selection of strings in the speaker enrollment process is important for accurate results. This means that the speaker will be tested for authentication with the same string with which he was enrolled at the time of his first access to the system.

General Terms

Feature Extraction and Selection, Pattern Recognition, Artificial Neural Network, Automatic Speaker Recognition
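As an illustration of how delta coefficients are commonly derived from static MFCC frames and appended to form a combined feature vector, the following is a minimal sketch. The abstract does not state how the delta features were computed, so the regression formula used here (a weighted difference of neighbouring frames, with edge frames repeated) is an assumption, and the names `delta`, `combine`, and `width` are hypothetical.

```python
def delta(frames, width=2):
    """Delta (first-order dynamic) coefficients for a sequence of frames.

    frames: list of per-frame coefficient lists (e.g. MFCC vectors).
    width:  number of neighbouring frames used on each side (assumed value).
    Uses the common regression formula
        d[t] = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n^2),  n = 1..width,
    repeating the first and last frame at the edges.
    """
    if width < 1:
        raise ValueError("width must be >= 1")
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]   # clamp at last frame
                behind = frames[max(t - n, 0)][k]             # clamp at first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def combine(static, dynamic):
    """Concatenate static and delta coefficients frame by frame."""
    return [s + d for s, d in zip(static, dynamic)]


# Toy input: five frames with two coefficients each (not real speech data).
mfcc = [[0.0, 5.0], [1.0, 5.0], [2.0, 5.0], [3.0, 5.0], [4.0, 5.0]]
features = combine(mfcc, delta(mfcc, width=2))
```

Each row of `features` holds the static coefficients followed by their deltas; other feature sets such as LPC or formant values could be appended to the rows in the same way.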
