Abstract

This paper investigates the possibility of adding new features to the classical Mel-Scaled Cepstral Coefficients (MFCCs) and their time derivatives. A hybrid Automatic Speech Recognition (ASR) system is used, based on a Neural Network (NN) and a collection of Hidden Markov Models (HMMs). It is shown that the gravity centres (GCs) of the energies in the frequency bands of the first three formants, together with their first and second time derivatives, can be added to the classical set of MFCCs and their first and second time derivatives, resulting in significant performance improvements. Nevertheless, in some cases the added parameters may have a negative effect on performance: they are reliable only for certain types of sounds, and their values may vary widely for the same sound in the presence of additive noise. Experiments have shown that one solution is to introduce a reliability index indicating how much importance the newly added parameters should have in describing a given frame. NNs appear to be suitable devices for taking this index into account in the computation of observation probabilities. Experiments have also shown improvements when GCs are computed from zero-crossing intervals detected at the output of the filters of an ear model. Intensities are obtained by associating a nonlinear peak-amplitude coding with each zero-crossing interval. Consistent improvements are observed when these solutions are applied with medium-size as well as large-size lexicons in the presence of additive noise.
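As an illustration of the gravity-centre feature, the following minimal sketch computes an energy-weighted mean frequency within each of three bands and a simple first time derivative. It is not the authors' implementation: the band edges in `FORMANT_BANDS_HZ`, the function names, the central-difference derivative, and the toy spectrum are all assumptions made for this example, and the abstract does not specify how the bands or derivatives are actually defined.

```python
# Hypothetical band edges in Hz for the first three formants.
# These values are illustrative assumptions, not taken from the paper.
FORMANT_BANDS_HZ = [(200.0, 900.0), (900.0, 2500.0), (2500.0, 3500.0)]


def gravity_centre(freqs_hz, energies, band):
    """Return the energy-weighted mean frequency inside `band`.

    Returns None when the band contains no energy, since the
    gravity centre is undefined in that case.
    """
    lo, hi = band
    total = 0.0
    weighted = 0.0
    for f, e in zip(freqs_hz, energies):
        if lo <= f < hi:
            total += e
            weighted += f * e
    if total <= 0.0:
        return None
    return weighted / total


def frame_gravity_centres(freqs_hz, energies, bands=FORMANT_BANDS_HZ):
    """Gravity centres of one frame, one value per band."""
    return [gravity_centre(freqs_hz, energies, b) for b in bands]


def delta(track, t):
    """First time derivative of a GC track at frame t.

    Uses a central difference, clamped at the edges of the track
    (an assumed, simplified derivative estimate).
    """
    prev = track[max(t - 1, 0)]
    nxt = track[min(t + 1, len(track) - 1)]
    return (nxt - prev) / 2.0


# Toy spectrum for one frame: bin centre frequencies and their energies.
freqs = [250.0, 500.0, 750.0, 1000.0, 1500.0, 2000.0, 2750.0, 3000.0, 3250.0]
energies = [1.0, 2.0, 1.0, 0.0, 4.0, 0.0, 1.0, 1.0, 2.0]
gcs = frame_gravity_centres(freqs, energies)  # [500.0, 1500.0, 3062.5]

# Toy track of the first-band GC over three frames, and its derivative.
gc1_track = [500.0, 520.0, 560.0]
d_gc1 = delta(gc1_track, 1)  # 30.0
```

Returning `None` for an empty band is one way to flag frames where the feature is undefined; a reliability index of the kind described above could play a similar role by down-weighting such frames rather than discarding them.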
