Abstract
The performance of an automatic speech recognition (ASR) system degrades significantly in the presence of noise and reverberation. Two observations motivate this work. First, the autoregressive (AR) modeling approach, which preserves the high-energy regions of the signal that are less susceptible to noise, offers a potential method for robust feature extraction. Second, the speech signal exhibits strong correlations in the spectro-temporal domain that are generally absent in noise. In this letter, we propose a novel method for speech feature extraction that combines the advantages of the AR approach and joint time-frequency processing using multivariate AR (MAR) modeling. Specifically, subband discrete cosine transform (DCT) coefficients obtained from multiple speech bands are used in the MAR framework to derive Riesz temporal envelopes, which provide the features for ASR. We perform several speech recognition experiments on the Aurora-4 database with clean and multicondition training. In these experiments, the proposed features provide significant improvements over other noise-robust feature extraction methods (relative improvements of 24% in clean training and 14% in multicondition training over mel features). Furthermore, speech recognition experiments on the REVERB challenge database illustrate the extension of the MAR modeling method to the suppression of reverberation artifacts.
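To make the described pipeline concrete, the following Python sketch illustrates the general idea of deriving smooth temporal envelopes from subband DCT trajectories via multivariate linear prediction. This is not the authors' implementation: the function names, frame and subband sizes, the least-squares VAR fit, and the use of a Hilbert (analytic-signal) magnitude as a stand-in for the Riesz envelope are all assumptions made for illustration.

```python
# Minimal sketch of a MAR-style envelope feature pipeline (illustrative only).
# Assumptions: 16 kHz audio, 25 ms frames with 10 ms hop, 16 subbands,
# least-squares VAR fit, Hilbert magnitude as a surrogate for the Riesz envelope.
import numpy as np
from scipy.fft import dct
from scipy.signal import hilbert


def subband_dct(signal, frame_len=400, hop=160, n_subbands=16):
    """Frame the signal, take a per-frame DCT, and group the coefficients
    into contiguous subbands; each subband yields one temporal trajectory."""
    frames = np.stack([signal[i:i + frame_len]
                       for i in range(0, len(signal) - frame_len + 1, hop)])
    coeffs = dct(frames, type=2, norm='ortho', axis=1)        # (n_frames, frame_len)
    bands = np.array_split(coeffs, n_subbands, axis=1)
    # Mean coefficient per subband serves as that subband's trajectory.
    return np.stack([b.mean(axis=1) for b in bands], axis=1)  # (n_frames, n_subbands)


def fit_mar(X, order=8):
    """Fit a multivariate (vector) AR model by least squares and return the
    one-step predictions, i.e. the smooth AR-modeled trajectories."""
    T, d = X.shape
    Y = X[order:]                                              # targets X[t]
    Z = np.hstack([X[order - k:T - k] for k in range(1, order + 1)])  # lags
    A, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return Z @ A                                               # (T - order, d)


def mar_envelopes(signal):
    """Derive smooth temporal envelopes of the subband DCT trajectories.
    The analytic-signal magnitude is used here as a simple surrogate
    for the Riesz envelope described in the paper."""
    X = subband_dct(signal)
    X_ar = fit_mar(X)
    return np.abs(hilbert(X_ar, axis=0))                       # envelope per subband


# Example: envelope features for one second of noise-like 16 kHz audio.
features = mar_envelopes(np.random.randn(16000))
print(features.shape)   # (n_frames - order, n_subbands)
```

In this sketch the multivariate AR fit captures correlations across subbands jointly rather than modeling each subband trajectory independently, which is the essential property the MAR framework exploits.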