Frame-by-frame language identification in short utterances using deep neural networks

Javier Gonzalez-Dominguez, Ignacio Lopez-Moreno, Pedro J Moreno, Joaquin Gonzalez-Rodriguez

doi:10.1016/j.neunet.2014.08.006

Open Access

https://doi.org/10.1016/j.neunet.2014.08.006

Abstract

This work addresses the use of deep neural networks (DNNs) in automatic language identification (LID), with a focus on short test utterances. Motivated by their recent success in acoustic modelling for speech recognition, we adapt DNNs to the problem of identifying the language of a given utterance from short-term acoustic features. We show that DNNs are particularly well suited to LID in real-time applications, because they can emit a language identification posterior at each new frame of the test utterance. We then analyse different aspects of the system, such as the amount of training data required, the number of hidden layers, the relevance of contextual information and the effect of the test utterance duration. Finally, we propose several methods to combine frame-by-frame posteriors. Experiments are conducted on two different datasets: the public NIST Language Recognition Evaluation 2009 (3 s task) and a much larger corpus (of 5 million utterances) known as Google 5M LID, obtained from different Google services. Reported results show relative improvements of DNNs over the i-vector system of 40% on the LRE09 3 s task and 76% on Google 5M LID.
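
The abstract mentions combining frame-by-frame posteriors but does not say which methods are used. As a rough illustration only, the sketch below assumes one common choice: averaging the log posteriors over all frames seen so far and picking the highest-scoring language. The function names, the `eps` floor and the toy posterior values are hypothetical and are not taken from the paper.

```python
import math


def combine_frame_posteriors(frame_posteriors, eps=1e-10):
    """Average per-frame log posteriors and return (best_index, scores).

    frame_posteriors: list of frames, each a list of per-language
    posteriors. Averaging log posteriors is an assumed combination
    rule, not necessarily one used in the paper.
    """
    if not frame_posteriors:
        raise ValueError("need at least one frame")
    num_langs = len(frame_posteriors[0])
    scores = [0.0] * num_langs
    for frame in frame_posteriors:
        for lang, p in enumerate(frame):
            # Floor the posterior to avoid log(0).
            scores[lang] += math.log(max(p, eps))
    scores = [s / len(frame_posteriors) for s in scores]
    best = max(range(num_langs), key=scores.__getitem__)
    return best, scores


def running_decisions(frame_posteriors):
    """Return the decision after each new frame, as a real-time system might."""
    return [
        combine_frame_posteriors(frame_posteriors[:t])[0]
        for t in range(1, len(frame_posteriors) + 1)
    ]


# Toy example: two languages, three frames (made-up values).
frames = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
decisions = running_decisions(frames)
```

Because a score is available after every frame, a decision can be revised as more of the utterance arrives, which is the property the abstract highlights for real-time use.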

Journal: Neural Networks: the official journal of the International Neural Network Society
Publication Date: Sep 3, 2014
Citations: 50
License type: other-oa

Similar Papers

Applying compensation techniques on i-vectors extracted from short-test utterances for speaker verification using deep neural network
Il-Ho Yang ... Hee-Soo Heo
01 Mar 2017

Language Identification in Short Utterances Using Long Short-Term Memory (LSTM) Recurrent Neural Networks.
Ruben Zazo ... Ian Mcloughlin
PloS one | VOL. 11
29 Jan 2016

On the use of deep feedforward neural networks for automatic language identification
Ignacio Lopez-Moreno ... Pedro J Moreno
Computer Speech & Language | VOL. 40
06 May 2016

Factorized Hidden Variability Learning for Adaptation of Short Duration Language Identification Models
Sarith Fernando ... Eliathambv Ambikairajah
01 Apr 2018
