Abstract

The speech features used for speaker recognition should uniquely reflect the characteristics of the speaker's vocal tract apparatus and contain negligible information about the linguistic content of the speech. Cepstral features such as Linear Predictive Cepstral Coefficients (LPCCs) and Mel Frequency Cepstral Coefficients (MFCCs) are the most commonly used features for the speaker recognition task, but they are found to be sensitive to noise and distortion. Other complementary features, originally used for speech recognition, can also be useful for speaker recognition. In this work, Line Spectral Pair (LSP) features (derived from baseline linear predictive coefficients) are used for text-independent speaker identification. With LSP features, the power spectral density at any frequency tends to depend only on the LSPs close to that frequency. In contrast, for cepstral features, a change in a particular parameter affects the whole spectrum. The goal here is to investigate the performance of LSP features against conventional cepstral features in the presence of acoustic disturbance. Experiments are carried out on the TIMIT and NTIMIT datasets to analyze performance under acoustic and channel distortions. It is observed that the LSP features perform as well as conventional cepstral features on the TIMIT dataset and show improved identification results on the NTIMIT dataset.
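
As an illustration of how LSP features can be derived from linear predictive coefficients, the following is a minimal sketch of the textbook conversion, not the authors' implementation. It assumes the LPC polynomial is given as A(z) = 1 + a_1 z^-1 + ... + a_p z^-p with a minimum-phase A(z); the function name `lpc_to_lsf` and the tolerance `eps` are hypothetical choices made for this example. The line spectral frequencies are taken as the angles of the unit-circle roots of the sum and difference polynomials P(z) and Q(z), excluding the trivial roots at z = 1 and z = -1.

```python
import numpy as np

def lpc_to_lsf(a, eps=1e-6):
    """Convert LPC coefficients [1, a_1, ..., a_p] to line spectral
    frequencies in radians, sorted in ascending order.

    Assumes A(z) is minimum phase, so the roots of P(z) and Q(z)
    lie on the unit circle. `eps` is an illustrative tolerance used
    to drop the trivial roots at z = 1 and z = -1.
    """
    a = np.asarray(a, dtype=float)
    # Extend A(z) by one zero coefficient so it can be combined with
    # its reversed (mirror-image) polynomial of degree p + 1.
    a_ext = np.concatenate([a, [0.0]])
    p_poly = a_ext + a_ext[::-1]   # P(z) = A(z) + z^-(p+1) A(1/z)
    q_poly = a_ext - a_ext[::-1]   # Q(z) = A(z) - z^-(p+1) A(1/z)
    roots = np.concatenate([np.roots(p_poly), np.roots(q_poly)])
    angles = np.angle(roots)
    # Keep one angle per conjugate pair: those strictly inside (0, pi).
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])

# Example: second-order predictor with poles at 0.9 * exp(+/- j*pi/4).
a = [1.0, -2 * 0.9 * np.cos(np.pi / 4), 0.81]
lsf = lpc_to_lsf(a)
print(lsf)
```

For a predictor of order p this yields p frequencies, with the roots of P(z) and Q(z) interleaved. Each frequency shapes the spectrum mainly in its own neighbourhood, which is the locality property contrasted with cepstral features above.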
