Abstract

Optimal representation of acoustic features is an ongoing challenge in automatic speech recognition research. As an initial step toward this purpose, optimization of filterbanks for the cepstral coefficient using evolutionary optimization methods is proposed in some approaches. However, the large number of optimization parameters required by a filterbank makes it difficult to guarantee that an individual optimized filterbank can provide the best representation for phoneme classification. Moreover, in many cases, a number of potential solutions are obtained. Each solution presents discrimination between specific groups of phonemes. In other words, each filterbank has its own particular advantage. Therefore, the aggregation of the discriminative information provided by filterbanks is demanding challenging task. In this study, the optimization of a number of complementary filterbanks is considered to provide a different representation of speech signals for phoneme classification using the hidden Markov model (HMM). Fuzzy information fusion is used to aggregate the decisions provided by HMMs. Fuzzy theory can effectively handle the uncertainties of classifiers trained with different representations of speech data. In this study, the output of the HMM classifiers of each expert is fused using a fuzzy decision fusion scheme. The decision fusion employed a global and local confidence measurement to formulate the reliability of each classifier based on both the global and local context when making overall decisions. Experiments were conducted based on clean and noisy phonetic samples. The proposed method outperformed conventional Mel frequency cepstral coefficients under both conditions in terms of overall phoneme classification accuracy. The fuzzy fusion scheme was shown to be capable of the aggregation of complementary information provided by each filterbank.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.