Abstract

The objective is to model the dominant speaker-specific source information in the time domain at three levels: subsegmental, segmental, and suprasegmental. The speaker-specific source information is contained in the linear prediction (LP) residual, which carries different speaker-specific information at each level. At each level, the features are modeled using the proposed method, hidden Markov models (HMMs), and compared with the existing Gaussian mixture model (GMM) approach. The experimental results demonstrate that the subsegmental level performs better than the other two levels. However, the evidence from the three levels of processing appears to be different and combines well to give better performance than the state-of-the-art speaker recognition system, demonstrating that different speaker information is captured at each level of processing. Finally, combining the evidence from all three levels with vocal tract information further improves speaker recognition performance. Experiments were conducted on the TIMIT database using GMMs and HMMs. Comparing the two sets of results, the proposed HMM approach performs better than the existing GMM approach.
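For readers unfamiliar with the LP residual mentioned above, the following is a minimal sketch of how it can be computed for a single frame using the standard autocorrelation method. It is an illustration under stated assumptions, not the authors' implementation: the function names `lp_coefficients` and `lp_residual`, the LP order of 10, and the synthetic test signal are all hypothetical choices, and the abstract does not specify the frame sizes or orders used at each level.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Estimate LP coefficients a[1..order] with the autocorrelation method."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation values r[0..order] of the frame.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal-equation matrix R[i, j] = r[|i - j|].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Small diagonal loading for numerical stability (an assumption of this sketch).
    R += 1e-9 * r[0] * np.eye(order)
    return np.linalg.solve(R, r[1 : order + 1])

def lp_residual(frame, order=10):
    """Return e[n] = s[n] - sum_k a[k] * s[n - k], the LP residual of one frame."""
    frame = np.asarray(frame, dtype=float)
    a = lp_coefficients(frame, order)
    pred = np.zeros_like(frame)
    for k in range(1, order + 1):
        # Samples before the start of the frame are treated as zero.
        pred[k:] += a[k - 1] * frame[:-k]
    return frame - pred

# Usage on a synthetic second-order autoregressive signal (hypothetical data).
rng = np.random.default_rng(0)
excitation = rng.standard_normal(400)
signal = np.zeros(400)
for i in range(2, 400):
    signal[i] = 1.3 * signal[i - 1] - 0.6 * signal[i - 2] + excitation[i]
residual = lp_residual(signal, order=10)
```

Because the LP filter removes most of the predictable (vocal tract) structure, the residual has lower energy than the input frame and mainly reflects the excitation source, which is the information the abstract proposes to model at each level.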
