Abstract

A new spectral representation incorporating time-frequency forward masking is proposed. This masked spectral representation is efficiently represented by a quefrency-domain parameter called the dynamic cepstrum (DyC). Automatic speech recognition experiments have demonstrated that DyC substantially improves performance in phoneme classification and phrase recognition. The new spectral representation simulates a perceived spectrum. It enhances formant transitions, which provide relevant cues for phoneme perception, while suppressing temporally stationary spectral properties, such as the effect of microphone frequency characteristics or speaker-dependent time-invariant spectral features. These properties are advantageous for speaker-independent speech recognition. DyC can efficiently represent both the instantaneous and transitional aspects of a running spectrum with a vector of the same size as a conventional cepstrum. DyC is calculated from a cepstrum time sequence using a matrix lifter. Each column vector of the matrix lifter performs spectral smoothing, and the smoothing characteristics are a function of the time interval between a masker and a signal. In experiments using hidden Markov models (HMMs), DyC outperformed a conventional cepstrum parameter obtained through linear predictive coding (LPC) analysis in both phoneme classification and phrase recognition. The improvement over the cepstrum parameter was even greater in speaker-independent recognition than in speaker-dependent recognition. Furthermore, in a classification experiment on phonemes contaminated by noise, DyC with only 16 coefficients achieved higher recognition performance than a combination of the cepstrum and a delta-cepstrum with 32 coefficients.
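
The matrix-lifter computation described above can be illustrated with a minimal sketch. The abstract does not give the lifter formula, so everything specific here is an assumption made for illustration: the function names `make_matrix_lifter` and `dynamic_cepstrum`, the Gaussian lifter shape, the parameters `alpha`, `beta`, `q0`, and `nu`, and the choice to subtract the weighted, smoothed preceding frames from the current frame. It is not the authors' implementation.

```python
import math


def make_matrix_lifter(num_coeffs, num_lags, alpha=0.3, beta=0.7, q0=6.0, nu=0.8):
    """Build a hypothetical matrix lifter, indexed as lifter[k][n - 1].

    k is the quefrency index and n = 1..num_lags is the time lag (in frames)
    between the masker and the signal. Each column is an assumed Gaussian
    low-pass lifter whose gain decays and whose quefrency width narrows with
    lag, so that older frames contribute a weaker, smoother masking pattern.
    """
    lifter = []
    for k in range(num_coeffs):
        row = []
        for n in range(1, num_lags + 1):
            gain = alpha * beta ** (n - 1)   # assumed decay of masking with lag
            width = q0 * nu ** (n - 1)       # assumed narrowing => more smoothing
            row.append(gain * math.exp(-(k * k) / (2.0 * width * width)))
        lifter.append(row)
    return lifter


def dynamic_cepstrum(cepstra, lifter):
    """Apply the matrix lifter to a cepstrum time sequence.

    cepstra is a list of frames, each a list of cepstral coefficients.
    Each output frame is the current frame minus the liftered sum of the
    preceding frames (an assumed form of forward masking). The output has
    the same number of coefficients per frame as the input.
    """
    num_lags = len(lifter[0])
    output = []
    for t, frame in enumerate(cepstra):
        masked = []
        for k, c in enumerate(frame):
            mask = sum(
                lifter[k][n - 1] * cepstra[t - n][k]
                for n in range(1, num_lags + 1)
                if t - n >= 0  # frames before the start contribute nothing
            )
            masked.append(c - mask)
        output.append(masked)
    return output
```

Under these assumptions, a spectral component that stays constant across frames is attenuated once enough preceding frames are available, while a component that has just appeared passes through largely unchanged. This mirrors the behaviour the abstract describes: stationary properties are suppressed and transitions are emphasised, with an output vector the same size as the input cepstrum.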
