Abstract

Reverberation is a phenomenon observed in almost all enclosed environments. Human listeners rarely have difficulty comprehending speech in reverberant environments, but automatic speech recognition (ASR) systems often suffer increased error rates under such conditions. In this work, we explore the role of robust acoustic features, motivated by human speech perception studies, in building ASR systems that are robust to reverberation effects. Using the dataset distributed for the Automatic Speech Recognition In Reverberant Environments (ASpIRE-2015) challenge organized by IARPA, we explore Gaussian mixture models (GMMs), deep neural networks (DNNs), and convolutional deep neural networks (CDNNs) as candidate acoustic models for recognizing continuous speech in reverberant environments. We demonstrate that DNN-based systems trained with robust features offer significant reductions in word error rate (WER) compared to systems trained with baseline mel-filterbank features. We present a novel time-frequency convolutional neural network (TFCNN) framework that performs convolution on the feature space along both the time and frequency axes, and we found it to consistently outperform the CDNN systems for all feature sets across all testing conditions. Finally, we show that further WER reduction is achievable through system fusion of n-best lists from multiple systems.
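
As a rough illustration of the time-frequency convolution idea described above, the following PyTorch sketch runs one convolution branch along the time axis and another along the frequency axis of an input feature patch, then concatenates the pooled outputs before fully connected layers. All layer sizes, kernel shapes, and parameter names here are illustrative assumptions, not the architecture reported in the paper.

    import torch
    import torch.nn as nn


    class TFCNN(nn.Module):
        """Sketch of a time-frequency convolution network: two parallel
        convolution branches over a time-frequency feature patch, one sliding
        along the time axis and one along the frequency axis; their pooled
        outputs are concatenated and classified by fully connected layers.
        Layer sizes are illustrative only."""

        def __init__(self, n_freq=40, n_frames=11, n_targets=3000):
            super().__init__()
            # Time-convolution branch: kernel covers all frequency bins,
            # slides across the frame (time) axis.
            self.time_conv = nn.Sequential(
                nn.Conv2d(1, 75, kernel_size=(n_freq, 3), padding=(0, 1)),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(1, 2)),
            )
            # Frequency-convolution branch: kernel covers all frames,
            # slides across the frequency axis.
            self.freq_conv = nn.Sequential(
                nn.Conv2d(1, 75, kernel_size=(8, n_frames), padding=(4, 0)),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(3, 1)),
            )
            # Work out the concatenated feature size with a dummy input.
            with torch.no_grad():
                dummy = torch.zeros(1, 1, n_freq, n_frames)
                feat_dim = (self.time_conv(dummy).flatten(1).size(1)
                            + self.freq_conv(dummy).flatten(1).size(1))
            # Fully connected classifier over the fused convolution outputs.
            self.classifier = nn.Sequential(
                nn.Linear(feat_dim, 1024),
                nn.ReLU(),
                nn.Linear(1024, n_targets),
            )

        def forward(self, x):
            # x: (batch, 1, n_freq, n_frames) time-frequency feature patch
            t = self.time_conv(x).flatten(1)
            f = self.freq_conv(x).flatten(1)
            return self.classifier(torch.cat([t, f], dim=1))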
