Abstract

I-vectors have been successfully applied in the speaker identification community to characterize a speaker and their acoustic environment. Recently, i-vectors have also proven useful in automatic speech recognition when concatenated to standard acoustic features. Instead of directly feeding the acoustic model with i-vectors, we here investigate a Multi-Task Learning approach, where a neural network is trained to simultaneously estimate phone-state posterior probabilities and extract i-vectors from the standard acoustic features. Multi-Task Learning is a regularization method that aims to improve a network's generalization ability by training a single network to solve several different, but related, tasks. The core idea of using i-vector extraction as an auxiliary task is to give the network an additional inter-speaker awareness and thus reduce overfitting. Overfitting is a common issue in speech recognition and is especially harmful when the amount of training data is limited. The proposed setup is trained and tested on the TIMIT database, while the acoustic modeling is performed using a Recurrent Neural Network with Long Short-Term Memory cells.
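The multi-task setup described above can be sketched as a shared representation feeding two task-specific heads: a softmax head for phone-state posteriors and a linear regression head for the i-vector target, trained with a combined loss. The sketch below is a minimal numpy illustration, not the paper's implementation: the dimensions, the single feed-forward shared layer (standing in for the paper's LSTM layers), and the weighting factor `lam` on the auxiliary loss are all assumptions for illustration.

```python
import numpy as np

# Illustrative dimensions (assumed, not from the paper).
np.random.seed(0)
feat_dim, hidden_dim = 40, 128   # acoustic feature dim -> shared hidden layer
n_states, ivec_dim = 183, 100    # phone-state classes / i-vector target dim

# Shared parameters, used by both tasks (stand-in for the LSTM stack).
W_shared = np.random.randn(feat_dim, hidden_dim) * 0.01
# Task-specific output heads.
W_phone = np.random.randn(hidden_dim, n_states) * 0.01   # main task head
W_ivec = np.random.randn(hidden_dim, ivec_dim) * 0.01    # auxiliary task head

def forward(x):
    """Run a batch of acoustic frames through the shared layer and both heads."""
    h = np.tanh(x @ W_shared)                  # shared hidden activations
    logits = h @ W_phone
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)  # softmax: phone-state posteriors
    ivec_pred = h @ W_ivec                      # linear regression: i-vector estimate
    return probs, ivec_pred

def mtl_loss(probs, target_state, ivec_pred, ivec_target, lam=0.1):
    """Combined objective: cross-entropy + lam * MSE (lam is an assumed weight)."""
    ce = -np.log(probs[np.arange(len(probs)), target_state] + 1e-12).mean()
    mse = ((ivec_pred - ivec_target) ** 2).mean()
    return ce + lam * mse

# One mini-batch of random frames, random targets, to exercise the sketch.
x = np.random.randn(8, feat_dim)
probs, ivec_pred = forward(x)
targets = np.random.randint(0, n_states, size=8)
loss = mtl_loss(probs, targets, ivec_pred, np.random.randn(8, ivec_dim))
```

During training, gradients from both losses flow back into `W_shared`, which is what gives the shared representation its speaker awareness; at test time only the phone-state head is used.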
