Abstract

The recently introduced m-vector approach uses Maximum Likelihood Linear Regression (MLLR) super-vectors for speaker verification: the MLLR super-vectors are estimated with respect to a Universal Background Model (UBM) without any transcription of the speech segments, and speaker m-vectors are obtained by uniform segmentation of these super-vectors. Hence, this approach does not exploit the phonetic content of the speech segments. In this paper, we propose integrating an Automatic Speech Recognition (ASR) based multi-class MLLR transformation into the m-vector system. We consider two variants, with MLLR transformations computed either on the 1-best (hypothesis) word transcription or on the lattice of word transcriptions. The latter is able to account for the risk of ASR transcription errors. We show that the proposed systems outperform the conventional method over various tasks of the NIST SRE 2008 core condition.
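The uniform segmentation step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the equal-length non-overlapping split, and the toy dimensions are all assumptions (real m-vector systems may use overlapping windows over the super-vector).

```python
# Hypothetical sketch: split a flat MLLR super-vector into equal-length
# m-vectors by uniform segmentation. Pure-Python, illustrative only.

def uniform_segments(supervector, num_segments):
    """Return num_segments equal-length slices of the super-vector.

    Assumes the super-vector length divides evenly by num_segments;
    a real system may instead use overlapping windows.
    """
    n = len(supervector)
    if n % num_segments != 0:
        raise ValueError("super-vector length must divide evenly")
    size = n // num_segments
    return [supervector[i * size:(i + 1) * size]
            for i in range(num_segments)]


# Toy example: a 12-dimensional "super-vector" cut into 3 m-vectors.
m_vectors = uniform_segments(list(range(12)), 3)
# → [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
```

Each resulting m-vector would then be scored independently in the verification back-end; the dimensions here are far smaller than those of an actual MLLR super-vector.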
