Analysis and prediction of acoustic speech features from mel-frequency cepstral coefficients in distributed speech recognition architectures

Jonathan Darch, Ben Milner, Saeed Vaseghi

doi:10.1121/1.2997436

Abstract

The aim of this work is to develop methods that enable acoustic speech features to be predicted from mel-frequency cepstral coefficient (MFCC) vectors as may be encountered in distributed speech recognition architectures. The work begins with a detailed analysis of the multiple correlation between acoustic speech features and MFCC vectors. This confirms the existence of correlation, which is found to be higher when measured within specific phonemes rather than globally across all speech sounds. The correlation analysis leads to the development of a statistical method of predicting acoustic speech features from MFCC vectors that utilizes a network of hidden Markov models (HMMs) to localize prediction to specific phonemes. Within each HMM, the joint density of acoustic features and MFCC vectors is modeled and used to make a maximum a posteriori prediction. Experimental results are presented across a range of conditions, such as with speaker-dependent, gender-dependent, and gender-independent constraints, and these show that acoustic speech features can be predicted from MFCC vectors with good accuracy. A comparison is also made against an alternative scheme that substitutes the higher-order MFCCs with acoustic features for transmission. This delivers accurate acoustic features but at the expense of a significant reduction in speech recognition accuracy.
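The two statistical ideas in the abstract, multiple correlation between an acoustic feature and the MFCC vector, and MAP prediction from a joint density, can be illustrated with a minimal sketch. It assumes a single joint Gaussian over the stacked vector of acoustic features and MFCCs, in which case the MAP estimate coincides with the conditional mean. The abstract does not state the form of the density, so this is an illustrative assumption rather than the authors' model. The function names, the synthetic data, and the dimensions (13 MFCCs, 2 acoustic features) are all hypothetical. Localizing prediction to phonemes, as the paper describes, would amount to fitting one such model per HMM state, which is not shown here.

```python
import numpy as np


def fit_joint_gaussian(feats, mfccs):
    """Estimate the mean and covariance of the stacked vector z = [f; x].

    feats: (n_frames, n_feat) acoustic features (hypothetical, e.g. formants)
    mfccs: (n_frames, n_mfcc) MFCC vectors
    """
    z = np.hstack([feats, mfccs])
    return z.mean(axis=0), np.cov(z, rowvar=False)


def map_predict(mean, cov, x, n_feat):
    """MAP estimate of f given x under a joint Gaussian (the conditional mean):

        f_hat = mu_f + Sigma_fx Sigma_xx^{-1} (x - mu_x)
    """
    mu_f, mu_x = mean[:n_feat], mean[n_feat:]
    cov_fx = cov[:n_feat, n_feat:]
    cov_xx = cov[n_feat:, n_feat:]
    # Solve the linear system instead of forming an explicit inverse.
    return mu_f + cov_fx @ np.linalg.solve(cov_xx, x - mu_x)


def multiple_correlation(feat, mfccs):
    """Multiple correlation R: the correlation between a scalar feature and
    its best least-squares linear predictor from the MFCC vector."""
    design = np.column_stack([np.ones(len(mfccs)), mfccs])
    coef, *_ = np.linalg.lstsq(design, feat, rcond=None)
    return np.corrcoef(feat, design @ coef)[0, 1]


# Synthetic stand-in data (assumption): features are a noisy linear
# function of the MFCCs, so correlation is high by construction.
rng = np.random.default_rng(0)
mfccs = rng.normal(size=(500, 13))
weights = rng.normal(size=(13, 2))
feats = mfccs @ weights + 0.1 * rng.normal(size=(500, 2))

mean, cov = fit_joint_gaussian(feats, mfccs)
x_new = rng.normal(size=13)
f_hat = map_predict(mean, cov, x_new, n_feat=2)
r = multiple_correlation(feats[:, 0], mfccs)
```

Fitting the same model separately on frames from each phoneme, rather than on all frames pooled, is one way to reproduce the comparison between phoneme-specific and global correlation that the abstract reports.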


Journal: The Journal of the Acoustical Society of America
Publication Date: Dec 1, 2008
Citations: 10
License type: other-oa


Similar Papers

Pitch prediction from Mel-frequency cepstral coefficients using sparse spectrum recovery
M V Achuth Rao ... Prasanta Kumar Ghosh
01 Mar 2017

On the use of variable frame rate analysis in speech recognition
Qifeng Zhu ... A Alwan
05 Jun 2000

Robust Acoustic Speech Feature Prediction From Noisy Mel-Frequency Cepstral Coefficients
Ben Milner ... Jonathan Darch
IEEE Transactions on Audio, Speech, and Language Processing | VOL. 19
01 Feb 2011

HMM-based MAP prediction of voiced and unvoiced formant frequencies from noisy MFCC vectors
Jonathan Darch ... Ben Milner
17 Sep 2006
