Abstract

The objective of this paper is to evaluate the effectiveness of complementary speech features extracted from a speaker for verification. Traditionally, speaker verification systems use a single feature to represent speaker-specific information. In this work, the extraction of both segmental and suprasegmental features is proposed, which yields a significant improvement in verification performance. Mel Frequency Cepstral Coefficients (MFCC), a segmental feature, capture the size and shape assumed by the vocal tract while producing various sound units. Pitch contributes to the uniqueness of the speaker's voice at the suprasegmental level, spanning a longer duration than the frames used for short-term spectral analysis. The scores obtained from the MFCC-based and Pitch-based systems are fused using a confidence measure. Speaker verification experiments were carried out on the CHAINS corpus. The equal error rate (EER) obtained for the MFCC system is 12.8%, and the MFCC system outperforms the system based on Pitch alone. The integration of MFCC and Pitch for speaker verification using a confidence measure gives an EER of 11.2%.
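As a rough illustration of the score-level fusion and EER evaluation described above, the sketch below combines per-trial MFCC and Pitch scores with a single confidence weight and estimates the EER by sweeping a decision threshold. The function names, the linear weighting, and the synthetic scores are illustrative assumptions; the abstract does not specify the paper's actual confidence measure or scoring back-end.

import numpy as np

def fuse_scores(mfcc_scores, pitch_scores, confidence):
    # Combine per-trial MFCC and Pitch scores with a confidence weight.
    # `confidence` in [0, 1] stands in for the paper's confidence measure,
    # whose exact definition is not given in the abstract.
    mfcc_scores = np.asarray(mfcc_scores, dtype=float)
    pitch_scores = np.asarray(pitch_scores, dtype=float)
    return confidence * mfcc_scores + (1.0 - confidence) * pitch_scores

def equal_error_rate(target_scores, impostor_scores):
    # Estimate the EER: the operating point where the false acceptance
    # rate (FAR) and false rejection rate (FRR) are (approximately) equal.
    thresholds = np.sort(np.concatenate([target_scores, impostor_scores]))
    best_gap, eer = float("inf"), 0.0
    for t in thresholds:
        far = np.mean(impostor_scores >= t)  # impostors wrongly accepted
        frr = np.mean(target_scores < t)     # targets wrongly rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return eer

# Example with synthetic scores (not data from the paper):
rng = np.random.default_rng(0)
target = fuse_scores(rng.normal(1.0, 1.0, 500), rng.normal(0.6, 1.0, 500), 0.7)
impostor = fuse_scores(rng.normal(-1.0, 1.0, 500), rng.normal(-0.4, 1.0, 500), 0.7)
print(f"EER = {equal_error_rate(target, impostor):.3f}")

In this sketch a higher confidence weight leans on the MFCC scores, mirroring the finding that the MFCC system alone outperforms the Pitch-only system; the reported gain from 12.8% to 11.2% EER comes from combining both.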
