Abstract

Prioritized grid long short-term memory (pGLSTM) has been shown to improve automatic speech recognition (ASR) efficiently. In this paper, we apply this state-of-the-art ASR model to text-independent Chinese speaker verification, adopting a DNN/i-vector (DNN-based i-vector) framework with a PLDA backend. To fully explore its performance, we compare the proposed pGLSTM-based UBM against GMM-UBM and HLSTM-UBM baselines. Because the amount of transcribed Chinese corpus available for ASR training is limited, we also explore an adaptation method: the pGLSTM-UBM is first trained on a large English corpus, and a PLDA adaptation backend is then fitted to Chinese before the final speaker verification scoring. Experiments show that both the pGLSTM-UBM with its corresponding PLDA backend and the pGLSTM-UBM with the adapted PLDA backend outperform the traditional GMM-UBM model. In particular, the pGLSTM-UBM with the PLDA backend achieves 4.94% EER on 5 s short utterances and 1.97% EER on 10 s short utterances, relative reductions of 47% and 51% compared to the GMM baseline. These results suggest that DNNs from ASR tasks can extend the advantage of the UBM, especially on short utterances, and that a better DNN model for ASR could yield further gains in speaker verification.
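The 4.94% and 1.97% figures above are equal error rates (EER), the standard speaker-verification metric: the operating point where the false-accept rate on impostor trials equals the false-reject rate on genuine trials. As a minimal sketch (not from the paper; the threshold-sweep approach and function name are illustrative), EER can be estimated from trial scores like this:

```python
import numpy as np

def compute_eer(genuine_scores, impostor_scores):
    """Estimate the equal error rate from verification trial scores.

    Sweeps every observed score as a decision threshold and returns the
    mean of FAR and FRR at the threshold where they are closest.
    """
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    best_eer, best_gap = 1.0, np.inf
    for t in thresholds:
        far = np.mean(impostor_scores >= t)  # impostors wrongly accepted
        frr = np.mean(genuine_scores < t)    # genuine speakers wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer

# Perfectly separated scores give 0% EER; overlapping scores give a
# positive rate, e.g. one error in three trials on each side -> ~33%.
print(compute_eer(np.array([2.0, 3.0, 4.0]), np.array([0.0, 0.5, 1.0])))
print(compute_eer(np.array([1.0, 2.0, 3.0]), np.array([0.0, 1.5, 2.5])))
```

In the paper's pipeline, the genuine and impostor scores fed into such a computation would come from the PLDA backend's log-likelihood-ratio scoring of enrollment/test i-vector pairs.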
