Abstract

The typical input to automatic speech recognition (ASR) algorithms is an acoustic waveform of word length or longer. The typical output consists of the names of items identified from stored vocabularies. Most algorithms also use scoring procedures to find the top matches between the input and the stored information. These evaluation scores, expressed as probabilities or distance metrics, may also be output. Many nonstandard applications of ASR technology combine identification with evaluation scores. Examples to be discussed include speech training to improve intelligibility, assessment of the spoken language proficiency of non-native speakers, and language training for adults with developmental disabilities. Most current ASR algorithms are fine-tuned for species-specific properties of human speech and language. In addition, many incorporate psychophysical properties of the human auditory system in the initial signal processing. However, the underlying pattern recognition algorithms are also applicable to a wide range of animal vocalizations. Some commercial and laboratory systems will be discussed in relation to nonstandard applications of ASR. [Work supported by NIH, RO1 DC-02229, R44 DC-02213, and R43 HD-35425.]
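
The pairing of identification with an evaluation score can be illustrated with a minimal sketch. It assumes the waveform has already been reduced to a one-dimensional feature sequence and uses dynamic time warping (DTW) as the distance metric; the abstract does not name a specific metric, so DTW is a stand-in. The function names, the toy vocabulary, and the feature values are all hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b[j-1] over another frame of a
                cost[i][j - 1],      # stretch a[i-1] over another frame of b
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def top_matches(features, vocabulary, k=3):
    """Return the k closest vocabulary items as (name, distance) pairs,
    sorted by increasing distance (smaller means a better match)."""
    scored = [
        (name, dtw_distance(features, template))
        for name, template in vocabulary.items()
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Hypothetical stored vocabulary: one made-up template per item.
VOCABULARY = {
    "yes": [1.0, 3.0, 2.0],
    "no": [5.0, 5.0, 1.0],
    "stop": [0.0, 0.0, 4.0, 4.0],
}

# Hypothetical input: a time-stretched rendition of the "yes" template.
matches = top_matches([1.0, 3.0, 3.0, 2.0], VOCABULARY, k=2)
for name, distance in matches:
    print(name, distance)
```

In a training or proficiency setting of the kind listed above, the distance attached to the intended item, rather than the identity of the best match alone, would be the quantity of interest.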
