Abstract

In his 2002 JASA paper, Ken Stevens proposed a model of human speech recognition based on extracting acoustic cues to the distinctive feature contrasts of the speaker's intended words. Following Halle (1995), he distinguished cues to manner features (e.g., landmarks: abrupt spectral events associated with constrictions or widenings of the vocal tract) from cues to other features related to voicing and place, and proposed that landmark detection is an early step in perception. Landmark cues are particularly useful: they are reliably produced, robustly detectable, and highly informative about the structure and lexical content of an utterance; they also identify adjacent regions rich in cues to additional features. During his working life, Ken's students developed many of the modules required to detect feature cues, and in the process discovered important aspects of the cues' systematic, context-governed variation. Current work aims at (1) completing the speech analysis system based on detection of feature cues and parameter values, (2) evaluating its performance, and (3) comparing its performance to human perceptual behavior. A system based on Ken's insights will have implications for models of human speech processing, for knowledge-based approaches to automatic speech recognition (ASR), and for a deeper understanding of the mechanisms underlying clinical speech problems and language learning.
