Abstract
An automatic, text-independent speaker verification (SV) system is proposed using Line Spectral Frequency (LSF) features. The state-of-the-art Gaussian Mixture Model with Universal Background Model (GMM-UBM) framework is used for speaker modeling and verification. Score-level fusion is employed to exploit complementary information from static and dynamic LSF features and to improve the noise robustness of the SV system. In addition, the speaker-discriminative power of different speech zones (vowels, non-vowels, and transitions) is investigated. Rapidly varying transition regions of speech are found to be the most speaker-discriminative in high-SNR conditions, whereas steady, high-energy vowel regions are robust against noise and are the most speaker-discriminative in low-SNR conditions. We show that selectively using features from a combination of transition and steady vowel zones further improves the performance of the score-level fusion based SV system under noisy conditions.
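To make the scoring pipeline concrete, the following is a minimal sketch of GMM-UBM scoring followed by score-level fusion of a static and a dynamic feature stream. It is an illustration under stated assumptions, not the system described above: the abstract does not specify the fusion rule, so a weighted sum with a hypothetical weight `alpha` is assumed, and the function names, the diagonal-covariance model, and the average log-likelihood ratio score are likewise assumptions.

```python
import numpy as np

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D), weights: (M,), means: (M, D), variances: (M, D).
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    diff = frames[:, None, :] - means[None, :, :]            # (T, M, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)    # normalizer
    )                                                        # (T, M)
    log_weighted = log_comp + np.log(weights)

    # Numerically stable log-sum-exp over mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(per_frame.mean())

def llr_score(frames, speaker_gmm, ubm):
    """Log-likelihood ratio of the claimed speaker model against the UBM."""
    return gmm_avg_loglik(frames, *speaker_gmm) - gmm_avg_loglik(frames, *ubm)

def fuse_scores(static_score, dynamic_score, alpha=0.5):
    """Assumed fusion rule: weighted sum of the two stream scores.

    alpha is a hypothetical weight; in practice it would be tuned on
    held-out data.
    """
    return alpha * static_score + (1.0 - alpha) * dynamic_score
```

Under these assumptions, a verification trial would compute one `llr_score` for the static LSF stream and one for the dynamic stream, combine them with `fuse_scores`, and accept the identity claim if the fused score exceeds a threshold chosen on development data.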