Multi-resolution sub-band features and models for HMM-based phonetic modelling

P.M. McCourt, S.V. Vaseghi, B. Doherty

doi:10.1006/csla.2000.0145

Abstract

HMM acoustic models are typically trained on a single set of cepstral features extracted over the full bandwidth of mel-spaced filterbank energies. In this paper, multi-resolution sub-band transformations of the log energy spectra are introduced, based on the conjecture that additional cues for phonetic discrimination may exist in local spectral correlates not captured by full-band analysis. In this approach, the discriminative contribution of sub-band features is considered to supplement, rather than substitute for, full-band features. HMMs trained on concatenated multi-resolution cepstral features are investigated, along with models based on linearly combined independent multi-resolution streams, in which the sub-band and full-band streams represent different resolutions of the same signal. For the stream-based models, discriminative training of the linear combination weights to a minimum classification error criterion is also applied. Both the concatenated-feature and the independent-stream modelling configurations are demonstrated to outperform traditional full-band cepstra for HMM-based acoustic phonetic modelling on the TIMIT database. In context-independent modelling experiments, the best result raises accuracy on the core test set from 62.3% for full-band models to 67.5% for discriminatively weighted multi-resolution sub-band modelling. A triphone accuracy of 73.9% on the core test set improves notably on full-band cepstra and compares well with results previously published on this task.
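
The two configurations described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the number of sub-bands, the coefficient counts, the contiguous equal-width band split, the use of a DCT-II as the cepstral transform, and the choice to combine per-stream log-likelihood scores are all assumptions made for illustration, and the function names `dct_ii`, `multires_cepstra` and `combine_streams` are hypothetical.

```python
import numpy as np

def dct_ii(x, n_coeffs):
    """Orthonormal DCT-II along the last axis, keeping the first n_coeffs terms."""
    n = x.shape[-1]
    k = np.arange(n_coeffs)[:, None]            # coefficient index, shape (n_coeffs, 1)
    m = np.arange(n)[None, :]                   # channel index, shape (1, n)
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    scale = np.full((n_coeffs, 1), np.sqrt(2.0 / n))
    scale[0, 0] = np.sqrt(1.0 / n)              # DC term uses a different normalisation
    return x @ (scale * basis).T

def multires_cepstra(log_fbank, n_bands=2, full_coeffs=12, band_coeffs=6):
    """Concatenate full-band cepstra with cepstra of each sub-band.

    log_fbank: array of shape (n_frames, n_channels) of log mel filterbank energies.
    n_bands, full_coeffs and band_coeffs are illustrative values (assumed).
    """
    feats = [dct_ii(log_fbank, full_coeffs)]    # full-band resolution
    # Assumed layout: contiguous, roughly equal-width sub-bands.
    for band in np.array_split(log_fbank, n_bands, axis=-1):
        feats.append(dct_ii(band, band_coeffs)) # sub-band resolution
    return np.concatenate(feats, axis=-1)

def combine_streams(stream_scores, weights):
    """Linear combination of per-stream scores.

    stream_scores: shape (n_classes, n_streams), e.g. log-likelihoods (assumed).
    weights: shape (n_streams,); in the paper these are trained discriminatively,
    here they are simply supplied by the caller.
    """
    return np.asarray(stream_scores) @ np.asarray(weights)
```

For example, `multires_cepstra` applied to a `(n_frames, 24)` array with the default arguments yields 12 full-band plus 2 × 6 sub-band coefficients per frame, and `combine_streams` then gives one combined score per class from which the best-scoring class can be selected.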

Journal: Computer Speech & Language
Publication Date: Jul 1, 2000
Citations: 4

Similar Papers

Discriminative spectral-temporal multiresolution features for speech recognition
P Mcmahon ... N Harte
01 Jan 1998

Multi-resolution cepstral features for phoneme recognition across speech sub-bands
P Mccourt ... S Vaseghi
12 May 1998

Mel-scaled discrete wavelet coefficients for speech recognition
J.N Gowdy ... Z Tufekci
05 Jun 2000

Hierarchical subband linear predictive cepstral (HSLPC) features for HMM-based speech recognition
R Chengalvarayan
01 Jan 1998
