Abstract
This paper proposes a feature extraction method that represents both the periodicity and the aperiodicity of speech for robust speech recognition. The method was motivated by findings in speech perception research. The speech signal is first filtered by a Gammatone filter bank, and the output of each filter is then comb filtered. A separate comb filter, designed individually for each Gammatone filter output, divides that output into periodic and aperiodic features in the subband. The power suppressed by comb filtering is taken as the periodic feature, whereas the power of the residue after comb filtering is taken as the aperiodic feature. Both features are used as feature parameters for automatic speech recognition. A preliminary experiment on a five-vowel recognition task, comparing the proposed approach with conventional MFCC-based feature extraction, shows that the proposed method improves vowel recognition rates by as much as 14.7% in the presence of pink noise or a harmonic complex tone interferer. An evaluation experiment on the Aurora-2J database (a Japanese noisy digit recognition database), comparing the proposed approach with the MFCC-based baseline, shows that the proposed method reduces the word error rate by as much as 59.62%, with an average reduction of 18.21%.
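The split of one subband signal into periodic and aperiodic power can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the comb filter design, so the sketch assumes a simple feedforward comb filter, r[n] = (x[n] - x[n - T]) / 2, where T is the pitch period in samples. The function name `comb_split` and its parameters are hypothetical, the Gammatone filtering stage is omitted, and the pitch period is assumed to be known in advance.

```python
import math


def comb_split(subband, period):
    """Split one subband signal into (periodic_power, aperiodic_power).

    Assumption: a feedforward comb filter r[n] = (x[n] - x[n - period]) / 2
    cancels components that repeat every `period` samples. The comb filter
    used in the paper may be designed differently.
    """
    if period <= 0 or period >= len(subband):
        raise ValueError("period must be between 1 and len(subband) - 1")

    # Skip the first `period` samples so every sample has a delayed partner.
    steady = subband[period:]
    residue = [(subband[n] - subband[n - period]) / 2.0
               for n in range(period, len(subband))]

    total_power = sum(v * v for v in steady) / len(steady)
    # Power of the residue after comb filtering: the aperiodic feature.
    aperiodic_power = sum(v * v for v in residue) / len(residue)
    # Power suppressed by comb filtering: the periodic feature.
    periodic_power = max(total_power - aperiodic_power, 0.0)
    return periodic_power, aperiodic_power


def tone(freq_hz, fs_hz, n_samples):
    """Generate a unit-amplitude sine tone (illustrative test signal)."""
    return [math.sin(2.0 * math.pi * freq_hz * n / fs_hz)
            for n in range(n_samples)]


if __name__ == "__main__":
    fs = 8000            # sampling rate in Hz (assumed for the example)
    period = fs // 100   # pitch period for a 100 Hz fundamental: 80 samples

    # A 100 Hz tone repeats every 80 samples, so the comb filter cancels it.
    print(comb_split(tone(100.0, fs, 800), period))
    # A 150 Hz tone is inverted by the 80-sample delay, so it passes intact.
    print(comb_split(tone(150.0, fs, 800), period))
```

With this filter, a component that is harmonic at the assumed pitch is counted almost entirely as periodic power, while a component lying midway between harmonics is counted as aperiodic power. In a full front end, the same split would be applied to every Gammatone channel and both powers would be passed on as features.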