Abstract

Experiments comparing isolated word recognition by human listeners with that of automatic speech recognition systems are valuable because error analyses may lead to improvements in speech recognition technology. Isolated word recognition in adult human listeners has been compared with the recognition performance of two commercially available speech-recognition systems. The test stimuli were drawn from the Lincoln Laboratory Stressed-Speech database. The database consists of 6930 stimuli (two iterations of each of 35 words spoken by nine different people in 11 different speaking styles). The vocabulary contains confusable words (e.g., go, hello, oh, no, and zero); the speaking styles include a wide range of naturally occurring variations (e.g., normal, slow, fast, soft, loud, angry). Analyses show that the acoustic characteristics of individual words vary considerably across talkers, and across styles within talkers. Performance of human listeners and the two machine-based recognition systems was tested in a single-talker, multistyle condition and in a multitalker, multistyle condition. All tests were conducted under two listening conditions: normal, and in the presence of masking noise. The data to be presented are the error patterns of human listeners and of the machine-recognition systems across talkers, across speaking styles, and across training conditions (multitalker, multistyle training versus single-talker, single-style training). [Work supported by Boeing Aerospace and Electronics.]
