Curriculum Learning based approaches for robust end-to-end far-field speech recognition

Shivesh Ranjan, John H.L. Hansen

doi:10.1016/j.specom.2021.06.003

Abstract

The performance of Automatic Speech Recognition (ASR) systems is known to degrade considerably on far-field speech capture. Consequently, far-field ASR has received considerable attention in recent years. Motivated by our recent work using Curriculum Learning (CL) based strategies to improve Speaker Identification (SID) under noisy and degraded conditions, this study proposes a novel CL-based approach to improve far-field ASR. Specifically, we propose a CL-based approach for training a Bidirectional Long Short-Term Memory (BLSTM) based ASR network with the Connectionist Temporal Classification (CTC) objective function. We initiate the training with comparatively easier near-field data, and progressively include more diverse (difficult) far-field data in the later stages of training. The proposed approaches are shown to significantly outperform the baseline BLSTM ASR system, offering relative reductions in word error rate (WER) of up to 7.3% and 10.1% on the dev and eval sets, respectively, of the AMI far-field voice capture corpus.
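The easy-to-hard data schedule described in the abstract can be illustrated with a minimal sketch. The abstract does not state how many stages are used, how quickly far-field data is introduced, or whether near-field data is kept in later stages, so the linear schedule, the retention of near-field data, and the names `curriculum_pools`, `near_field`, `far_field`, and `num_stages` below are all assumptions made for illustration, not the authors' implementation.

```python
import random


def curriculum_pools(near_field, far_field, num_stages, seed=0):
    """Yield one training pool per stage (hypothetical helper).

    Every pool contains all near-field utterances; the share of
    far-field utterances grows from none to all across the stages.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    rng = random.Random(seed)
    far_shuffled = list(far_field)
    rng.shuffle(far_shuffled)  # fixed order so that later pools extend earlier ones
    for stage in range(num_stages):
        # Assumed linear schedule: stage 0 has no far-field data,
        # the final stage has all of it.
        fraction = stage / (num_stages - 1) if num_stages > 1 else 1.0
        cutoff = round(fraction * len(far_shuffled))
        yield list(near_field) + far_shuffled[:cutoff]


if __name__ == "__main__":
    near = ["near_utt_1", "near_utt_2"]
    far = ["far_utt_1", "far_utt_2", "far_utt_3", "far_utt_4"]
    for stage, pool in enumerate(curriculum_pools(near, far, num_stages=3)):
        print(stage, len(pool))
```

In a real training loop, each pool would be used for one or more epochs of CTC training before moving to the next stage; that loop is omitted here because the abstract gives no details about it.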


Journal: Speech Communication
Publication Date: Jun 18, 2021
Citations: 3

Similar Papers

Combined speech enhancement and auditory modelling for robust distributed speech recognition
Ronan Flynn ... Edward Jones
Speech Communication | VOL. 50
20 May 2008

Analyzing Uncertainties in Speech Recognition Using Dropout
Apoorv Vyas ... Sibo Tong
01 May 2019

Autocorrelation-based Methods for Noise-Robust Speech Recognition
Gholamreza Farahani ... Mohammad Mehdi
01 Jun 2007

Non-native pronunciation variation modeling using an indirect data driven method
Mina Kim ... Yoo Rhee Oh
01 Jan 2007
