Chapter 6 - Robust Speech Recognition Under Noisy Ambient Conditions

Kuldip K. Paliwal, Kaisheng Yao

doi:10.1016/b978-0-12-374708-2.00006-1

Abstract

This chapter provides an overview of an automatic speech recognition system, describes the sources of speech variability that cause mismatch between training and testing conditions, and discusses some of the current techniques for achieving robust speech recognition. Automatic speech recognition is critical in natural human-centric interfaces for ambient intelligence. The performance of an automatic speech recognition system, however, degrades drastically when there is a mismatch between training and testing conditions. The aim of robust speech recognition is to overcome this mismatch so that recognition performance degrades only moderately and gracefully. The main factors that have made speech recognition possible are advances in digital signal processing (DSP) and stochastic modeling algorithms. Signal processing techniques are important for extracting reliable acoustic features from the speech signal, and stochastic modeling algorithms are useful for representing speech utterances in the form of efficient models, such as hidden Markov models (HMMs), which simplify the speech recognition task. Other factors responsible for the commercial success of speech recognition technology include the availability of fast processors (in the form of DSP chips) and high-density memories at relatively low cost. In the design of a practical robust speech recognition system for ambient intelligence, computational complexity is a very important factor. It is therefore worthwhile to revise robust speech recognition methods to obtain simplified procedures, albeit with some loss in performance. Balancing performance against computational cost in robust speech recognition for ambient intelligence is a design art.

Full Text