Abstract
Recent studies show that Gaussian mixture model (GMM) weights carry less, yet complementary, information than GMM means for language and dialect recognition. However, state-of-the-art language recognition systems usually do not use this information. In this research, a non-negative factor analysis (NFA) approach is developed for GMM weight decomposition and adaptation. This modeling, which is conceptually simple and computationally inexpensive, suggests a new low-dimensional utterance representation method using a factor analysis similar to that of the i-vector framework. The obtained subspace vectors are then applied in conjunction with i-vectors to the language/dialect recognition problem. The suggested approach is evaluated on the NIST 2011 and RATS language recognition evaluation (LRE) corpora and on the QCRI Arabic dialect recognition evaluation (DRE) corpus. The assessment results show that the proposed adaptation method yields more accurate recognition results than three conventional weight adaptation approaches, namely maximum likelihood re-estimation, non-negative matrix factorization, and a subspace multinomial model. Experimental results also show that intermediate-level fusion of i-vectors and NFA subspace vectors improves the performance of the state-of-the-art i-vector framework, especially for short utterances.
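To make the weight-decomposition idea in the abstract concrete, the sketch below illustrates one way an utterance's GMM weight vector can be modeled as a low-rank offset from the UBM weights, w = b + Lr, with the low-dimensional vector r serving as the utterance representation. This is a minimal, assumed illustration: the dimensions, the random subspace matrix, and the simple projected-gradient estimation of r are placeholders, not the paper's training recipe.

```python
import numpy as np

# Minimal sketch of an NFA-style weight decomposition: utterance GMM
# weights w are modeled as w = b + L r, where b are the UBM weights,
# L is a low-rank subspace matrix, and r is the utterance subspace
# vector used as a feature. All names, sizes, and the update rule are
# illustrative assumptions.

rng = np.random.default_rng(0)
C, R = 64, 8  # number of GMM components and subspace dimension (assumed)

b = np.full(C, 1.0 / C)                  # UBM weights (uniform for illustration)
L = 0.01 * rng.standard_normal((C, R))   # subspace matrix (random here; learned in practice)

# Sufficient statistics: soft counts of frames per component for one utterance.
counts = rng.integers(1, 50, size=C).astype(float)

def estimate_r(counts, b, L, steps=200, lr=0.05):
    """Estimate the utterance vector r by gradient ascent on the
    multinomial log-likelihood of the soft counts under w = b + L r,
    keeping the implied weights non-negative via clipping."""
    r = np.zeros(L.shape[1])
    for _ in range(steps):
        w = np.clip(b + L @ r, 1e-10, None)
        w = w / w.sum()  # renormalize to a valid weight vector
        # Gradient of sum_c n_c log w_c subject to sum_c w_c = 1.
        grad = L.T @ (counts / w - counts.sum())
        r += lr * grad / counts.sum()
    return r

r = estimate_r(counts, b, L)
print("subspace vector r:", np.round(r, 3))
```

In the paper's setting such vectors would be extracted per utterance and fused with i-vectors before classification; the sketch only shows the decomposition step.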