Improved Frequency Modulation Features for Multichannel Distant Speech Recognition

Isidoros Rodomagoulakis, Petros Maragos

doi:10.1109/jstsp.2019.2923372

Abstract

Frequency modulation features capture the fine structure of speech formants and are beneficial and supplementary to the traditional energy-based cepstral features. Improvements have been demonstrated mainly in GMM-HMM systems for small and large vocabulary tasks. Yet, they have seen limited application in DNN-HMM systems and Distant Speech Recognition (DSR) tasks. Herein, we elaborate on their integration within state-of-the-art front-end schemes that include post-processing of MFCCs, resulting in discriminant and speaker-adapted features of large temporal contexts. We explore 1) multichannel demodulation schemes for multi-microphone setups, 2) richer descriptors of frequency modulations, and 3) feature transformation and combination via hierarchical deep networks. We present results for tandem and hybrid recognition with GMM and DNN acoustic models, respectively. The improved modulation features are combined efficiently with MFCCs, yielding modest and consistent improvements in multichannel distant speech recognition tasks in reverberant and noisy environments, where recognition rates are far from human performance.
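
The abstract does not say which demodulation algorithm is used, so the following is only a minimal sketch of how a frequency modulation feature can be obtained in general. It assumes the Teager-Kaiser energy operator with the DESA-2 energy separation algorithm, a standard technique in the AM-FM literature, and it is not presented as the authors' implementation. The function names (`teager_kaiser`, `desa2`, `weighted_mean_frequency`), the amplitude-weighted mean as the feature, and the synthetic test tone are all illustrative assumptions. In a real front-end the demodulation would typically be applied to each band-pass filtered channel of the speech signal; here a pure tone stands in for one such band.

```python
import numpy as np


def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def desa2(x, eps=1e-12):
    """DESA-2 estimates of instantaneous frequency (rad/sample) and amplitude.

    Valid for frequencies below a quarter of the sampling rate.
    The outputs are 4 samples shorter than the input (edges are dropped).
    """
    x = np.asarray(x, dtype=float)
    y = x[2:] - x[:-2]                # symmetric difference x[n+1] - x[n-1]
    psi_x = teager_kaiser(x)[1:-1]    # trimmed so it aligns with psi_y
    psi_y = teager_kaiser(y)
    ratio = psi_y / (2.0 * np.maximum(psi_x, eps))
    omega = 0.5 * np.arccos(np.clip(1.0 - ratio, -1.0, 1.0))
    amp = 2.0 * psi_x / np.sqrt(np.maximum(psi_y, eps))
    return omega, amp


def weighted_mean_frequency(freq_hz, amp):
    """Amplitude-weighted mean instantaneous frequency over one frame
    (an assumed, illustrative choice of frame-level descriptor)."""
    w = amp ** 2
    return float(np.sum(w * freq_hz) / np.sum(w))


# Synthetic stand-in for one band-passed channel: a 1 kHz tone at 16 kHz.
fs = 16000.0
f0 = 1000.0
n = np.arange(400)
x = 0.5 * np.cos(2.0 * np.pi * f0 * n / fs + 0.3)

omega, amp = desa2(x)
freq_hz = omega * fs / (2.0 * np.pi)   # convert rad/sample to Hz
mean_f = weighted_mean_frequency(freq_hz, amp)
print(mean_f, amp.mean())
```

For the constant-frequency tone the estimates recover the tone's frequency and amplitude; on real speech the same quantities vary within a frame, and it is that variation that frame-level modulation descriptors summarize.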

Journal: IEEE Journal of Selected Topics in Signal Processing
Publication Date: Nov 23, 2018
Citations: 49

Similar Papers

Exemplar-Based Sparse Representation Features: From TIMIT to LVCSR
Tara N Sainath ... David Nahamoo
IEEE Transactions on Audio, Speech, and Language Processing | VOL. 19
01 Nov 2011

On the improvement of modulation features using multi-microphone energy tracking for robust distant speech recognition
Isidoros Rodomagoulakis ... Petros Maragos
01 Aug 2017

Temporal AM–FM combination for robust speech recognition
Yotaro Kubo ... Katsuhiko Shirai
Speech Communication | VOL. 53
01 Sep 2010

Articulatory motivated acoustic features for speech recognition
Daniil Kocharov ... Ralf Schlüter
04 Sep 2005
