Abstract

In this work, we explore the use of sparse features derived using a learned dictionary for language recognition (LR). These sparse features, referred to as s-vectors, are derived by sparse coding the commonly used low-dimensional i-vector representation of speech utterances over the learned dictionary. Sparse coding algorithms based on orthogonal matching pursuit (OMP), the least absolute shrinkage and selection operator (LASSO), and the elastic net (ENet) have been investigated for deriving the s-vectors. Two classifiers, namely cosine distance scoring (CDS) and a support vector machine (SVM), have been applied to the s-vectors. Scores are calibrated using regularized multi-class logistic regression. The effectiveness of the proposed approach is empirically validated on the NIST 2007 LRE data set in the closed-set condition on 30-second segments.
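As a rough illustration of the pipeline described above, the following minimal sketch sparse-codes a vector over a dictionary with a greedy OMP routine and compares the resulting s-vectors with a cosine score. It is not the authors' implementation: the dictionary here is random rather than learned, the "i-vectors" are synthetic, and the dimensions, the sparsity level, and the names `omp` and `cosine_score` are all assumptions made for the example. LASSO and ENet would replace the hard sparsity limit with an l1 penalty, or a mix of l1 and l2 penalties, respectively.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Greedy OMP: approximate x using at most n_nonzero columns (atoms) of D."""
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

def cosine_score(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical sizes: 20-dim "i-vectors", 50 atoms, at most 5 active atoms.
rng = np.random.default_rng(0)
dim, n_atoms, sparsity = 20, 50, 5

# Stand-in for a learned dictionary: random atoms scaled to unit norm.
D = rng.standard_normal((dim, n_atoms))
D /= np.linalg.norm(D, axis=0)

# Synthetic "i-vectors": a reference vector and a noisy copy of it.
x_ref = D[:, [3, 17, 41]] @ np.array([1.0, -0.5, 0.8])
x_test = x_ref + 0.01 * rng.standard_normal(dim)

# s-vectors are the sparse codes of the i-vectors over the dictionary.
s_ref = omp(D, x_ref, sparsity)
s_test = omp(D, x_test, sparsity)

score = cosine_score(s_ref, s_test)
print("non-zeros in s_test:", np.count_nonzero(s_test))
print("cosine score:", score)
```

In a real system the reference would typically be a per-language model built from training s-vectors, and the raw scores would then be passed to a calibration stage; both steps are omitted here.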

