Abstract
In this paper, we revisit the classical Singular Value Decomposition (SVD) based approach for dimension reduction in Language Identification (LID), proposed as an alternative to the state-of-the-art Total Variability Space (TVS) based framework. A UBM-GMM is first built as in the state-of-the-art system. The training utterances are aligned with the UBM using MAP adaptation to yield supervectors. The training supervectors are stacked row-wise to form a matrix, and SVD is performed on this matrix. The issue of the resulting ill-conditioned matrix is solved using a novel proxy projection technique. The supervectors are then projected along the top $\mathcal{L}$ singular vectors, and an SVM-based classifier is trained on the projected supervectors. During testing, the test supervector obtained by aligning with the UBM-GMM is projected along the same $\mathcal{L}$ directions, and the reduced-dimension test vector is classified using the SVM classifier. The proposed system shows an absolute improvement of 8.4% over the best i-vector based LID system for 30-second utterances of the CallFriend dataset with 12 languages. The proxy projection technique gives a ≥3% absolute improvement over ordinary projection. As the T-matrix obtained in the TVS does not have an orthogonal basis, the i-vectors are projected onto an orthogonal basis through SVD, which gives an absolute improvement of 6.4%. The proposed approach scales well, with an accuracy of 93.87% on the Topcoder dataset with 176 languages.
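The core projection-and-classification pipeline described above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the matrix sizes, random supervectors, and labels are all placeholders, and a plain linear SVM stands in for the paper's SVM-based classifier.

```python
import numpy as np
from sklearn.svm import SVC

# Toy stand-ins for MAP-adapted supervectors (sizes are illustrative only)
rng = np.random.default_rng(0)
n_utts, dim, L = 40, 64, 5
S = rng.standard_normal((n_utts, dim))   # rows = training supervectors
labels = rng.integers(0, 3, n_utts)      # toy language labels

# SVD of the row-stacked supervector matrix; rows of Vt are right singular vectors
U, s, Vt = np.linalg.svd(S, full_matrices=False)
P = Vt[:L].T                             # top-L singular directions (dim x L)

# Project training supervectors and train the classifier on reduced vectors
X_train = S @ P
clf = SVC(kernel="linear").fit(X_train, labels)

# A test supervector is projected along the same L directions, then classified
test_sv = rng.standard_normal(dim)
pred = clf.predict((test_sv @ P).reshape(1, -1))
```

Note that this sketch uses an ordinary projection; the paper's proxy projection technique for handling the ill-conditioned supervector matrix is not reproduced here.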