Abstract

Speaker recognition over a public telephone network involves various types of transmission channels and handsets. This leads to channel mismatch between the enrolled models and the test utterances, and hence to a significant decline in speaker recognition performance. This paper proposes a cohort-based speaker model synthesis algorithm, which synthesizes speaker models for channels where no enrollment data is available. The algorithm applies a priori knowledge of channels, extracted from speaker-specific cohort sets, to synthesize the speaker models. Results on the China Criminal Police College (CCPC) speaker recognition corpus, which contains utterances from both a landline and a mobile channel, show significant improvements over the HT-Norm and UBM-based speaker model synthesis algorithms.
