Supervised I-vector modeling for language and accent recognition

Shreyas Ramoji, Sriram Ganapathy

doi:10.1016/j.csl.2019.101030

Open Access

https://doi.org/10.1016/j.csl.2019.101030

Journal: Computer Speech & Language
Publication Date: Oct 11, 2019
Citations: 5
License type: publisher-specific-oa

Affiliation: Indian Institute of Science Bangalore

Abstract

The conventional i-vector approach to speaker and language recognition is an unsupervised learning paradigm in which a variable-length speech utterance is converted into a fixed-dimensional feature vector (termed an i-vector). The i-vector approach belongs to the broader family of factor analysis models, where the utterance-level adapted means of a Gaussian Mixture Model - Universal Background Model (GMM-UBM) are assumed to lie in a low-rank subspace. The latent variables in the low-rank model are assumed to have a standard Gaussian prior distribution. In this paper, we rework the theory of i-vector modeling in a supervised framework, where the class labels (such as language or accent) of the speech recordings are introduced directly into the i-vector model using a mixture Gaussian prior in which each mixture component is associated with a class label. We provide the mathematical formulation for the minimum mean squared error (MMSE) estimate of the supervised i-vector (s-vector) model. A detailed analysis of the s-vector model is given and contrasted with the traditional i-vector framework. The proposed model is evaluated on language recognition tasks using the NIST Language Recognition Evaluation (LRE) 2017 dataset, as well as on an accent recognition task using the Mozilla Common Voice dataset. In these experiments, the s-vector model provides significant improvements over the conventional i-vector model (relative improvements of up to 24% for the LRE task in terms of the primary detection cost metric).
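The idea of an MMSE estimate under a mixture Gaussian prior can be illustrated with a minimal sketch. This is not the paper's s-vector formulation, which operates on GMM-UBM statistics; it assumes a simplified toy model y = T x + e with isotropic noise, and a prior on the latent vector x that is a mixture of unit-covariance Gaussians, one per class. The function name `mmse_latent_estimate` and all parameter names are hypothetical, chosen for illustration only.

```python
import numpy as np

def mmse_latent_estimate(y, T, noise_var, weights, means):
    """Toy MMSE estimate of x for y = T x + e, e ~ N(0, noise_var * I),
    under the prior x ~ sum_c weights[c] * N(means[c], I).

    Returns the posterior mean of x and the class posteriors p(c | y).
    """
    D, R = T.shape
    # All components share identity covariance, so the posterior
    # covariance of x given (y, c) is the same for every class c.
    post_cov = np.linalg.inv(np.eye(R) + (T.T @ T) / noise_var)
    # Marginal covariance of y given any single class.
    S_inv = np.linalg.inv(T @ T.T + noise_var * np.eye(D))

    n_classes = len(weights)
    log_resp = np.empty(n_classes)
    comp_means = np.empty((n_classes, R))
    for c in range(n_classes):
        resid = y - T @ means[c]
        # log(weights[c] * N(y; T means[c], S)), dropping terms shared by all c.
        log_resp[c] = np.log(weights[c]) - 0.5 * resid @ S_inv @ resid
        # Class-conditional posterior mean E[x | y, c].
        comp_means[c] = post_cov @ (T.T @ y / noise_var + means[c])

    # Normalize in the log domain for numerical stability.
    resp = np.exp(log_resp - log_resp.max())
    resp /= resp.sum()
    # MMSE estimate: class-posterior-weighted average of the conditional means.
    return resp @ comp_means, resp
```

With a single component of zero mean, the estimate reduces to the usual posterior mean under a standard Gaussian prior, which mirrors the contrast the abstract draws between the conventional and supervised models. With several components, the class posteriors pull the estimate toward the mean of the most likely class.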

Similar Papers

Supervised I-vector Modeling - Theory and Applications
Shreyas Ramoji ... Sriram Ganapathy
02 Sep 2018

Bottleneck and Embedding Representation of Speech for DNN-based Language and Speaker Recognition
Alicia Lozano-Diez ... Joaquin Gonzalez-Rodriguez
21 Nov 2018

Foreign accent detection from spoken Finnish using i-vectors
Hamid Behravan ... Ville Hautamäki
25 Aug 2013

Discriminative Universal Background Model Training for Speaker Recognition
Wei-Qiang Zhang ... Jia Liu
21 Jun 2011
