Abstract

In this talk, we discuss the ways in which the parallel study of speech production and speech perception can help us develop better automatic speech recognition (ASR) systems. The ultimate goal of speech recognition (recognition of spontaneous speech from any talker in any language) remains elusive due to a high degree of inter- and intra-speaker variability in the production of a given sequence of sounds. While the acoustic information required for recognition may be present in the signal, its distribution, strength, and location are consistent and predictable only as a function of lawful changes in speech movements and/or listener perceptions. Understanding speech acoustics from this perspective is vitally important if we are to achieve our ultimate goal. We will give several examples of lessons learned from studies of speech production and speech perception, and show how the knowledge gained can inform the engineering of robust ASR systems.
