Abstract

Listeners can recognize sine-wave replicas of utterances synthesized using three time-varying sinusoids [Remez et al., Science 212, 947–950 (1981)]. Amplitude comodulation of such sine-wave sentences (SWS) further improves intelligibility [Carrell and Opie, Percept. Psychophys. 52, 437–445 (1992)]. In this study, an automatic speech recognition (ASR) task was used to investigate two issues. First, is the increased intelligibility of comodulated SWS due to a greater resemblance to natural speech? Second, is it necessary to acquire new speech schemas for SWS, or can existing schemas be accessed using a different strategy? An ASR system trained and tested on SWS stimuli gave a word recognition rate of 85%, compared to 92% for a system trained and tested on natural speech. Comodulated SWS was also recognized at 85%. It appears that the information content of SWS is only slightly below that of natural speech and is unaffected by comodulation. Testing SWS on models trained on natural speech resulted in low recognition (5%). A second experiment modified the recognition strategy using occluded speech recognition techniques [Green et al., ICASSP 401–404 (1995)] and raised the recognition rate for SWS on natural-speech models to 46%. These results suggest that SWS recognition does not necessarily rely on acquiring new SWS schemas.
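
As a rough illustration of the stimuli described above, the following minimal sketch sums three time-varying sinusoids and optionally multiplies them by a shared amplitude modulator (comodulation). The function name `synthesize_sws`, the raised-cosine modulator, the 100 Hz rate, and the linear frequency glides are all assumptions made for this example; they are not the synthesis procedure or the parameters used in the cited studies.

```python
import math


def synthesize_sws(freq_tracks, amp_tracks, sample_rate=8000,
                   comod_hz=None, comod_depth=1.0):
    """Sum sinusoids whose frequency (Hz) and amplitude are given per sample.

    freq_tracks, amp_tracks: one list per sinusoid, all of equal length.
    comod_hz: if set, the summed signal is multiplied by a single
              raised-cosine modulator at this rate, so every sinusoid
              shares the same amplitude envelope (hypothetical choice).
    """
    n = len(freq_tracks[0])
    out = [0.0] * n
    for freqs, amps in zip(freq_tracks, amp_tracks):
        phase = 0.0
        for i in range(n):
            # Integrate instantaneous frequency to obtain a continuous phase.
            phase += 2.0 * math.pi * freqs[i] / sample_rate
            out[i] += amps[i] * math.sin(phase)
    if comod_hz is not None:
        for i in range(n):
            # Modulator lies in [1 - comod_depth, 1] and equals 1 at i = 0.
            cos_term = math.cos(2.0 * math.pi * comod_hz * i / sample_rate)
            out[i] *= 1.0 - comod_depth * 0.5 * (1.0 - cos_term)
    return out


def linear_track(start, end, n):
    """Linear glide from start to end over n samples (illustrative only)."""
    return [start + (end - start) * i / (n - 1) for i in range(n)]


# Three made-up "formant" tracks over 0.1 s at 8 kHz.
N = 800
freqs = [linear_track(500, 700, N),
         linear_track(1500, 1200, N),
         linear_track(2500, 2600, N)]
amps = [[1.0] * N, [0.5] * N, [0.25] * N]

plain = synthesize_sws(freqs, amps)
comodulated = synthesize_sws(freqs, amps, comod_hz=100.0)
```

Because the modulator is common to all three components, applying it to the summed signal is equivalent to applying it to each sinusoid separately.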
