Abstract

This study examines the feasibility of a knowledge-based approach to speaker-independent speech recognition in the presence of impulsive environmental sounds such as knocks, clinks, and claps. Statistical approaches to speech recognition have had some success in dealing with steady background noise, probably because they have concentrated on routinely encountered steady background sounds, most of which can be modeled as white or colored noise. However, current statistical approaches are less suited to environments containing sporadic occurrences of various discrete-event sounds, because (1) the variety of discrete-event sounds is enormous and (2) discrete-event sounds can be mixed with the speech signal at different loudness levels and temporal alignments. In this study, experiments are being performed on feature-based speech recognition using speech sounds (from a database of spoken telephone numbers) mixed with impulsive sounds (from a database of everyday environmental impulsive sounds). An important objective of this research is to determine how different acoustic cues (formants, pitch, frication noise, stop bursts, etc.) are influenced by the presence of different impulsive sounds. [This research was supported by NSF Research Grant No. IRI-9300194.]
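As an illustration of the two mixing variables named above (loudness and temporal alignment), the following is a minimal sketch of how an impulsive sound might be added to a speech signal at a chosen level and onset. It is not the procedure used in the study, which the abstract does not describe. The function names `rms` and `mix_at_snr`, the parameters `snr_db` and `offset`, and the choice to define the level as the ratio of whole-signal RMS values are all assumptions made for this example.

```python
import math


def rms(x):
    """Root-mean-square amplitude of a sequence of samples (0.0 if empty)."""
    return math.sqrt(sum(v * v for v in x) / len(x)) if x else 0.0


def mix_at_snr(speech, impulse, snr_db, offset):
    """Hypothetical helper: add `impulse` to `speech` starting at sample
    `offset`, scaled so that rms(speech) / rms(scaled impulse) equals
    `snr_db` decibels. Assumption: the level is measured over each whole
    signal, not only over the overlapping region."""
    if offset < 0 or offset >= len(speech):
        raise ValueError("offset must fall inside the speech signal")
    impulse_rms = rms(impulse)
    if impulse_rms == 0.0:
        return list(speech)  # a silent impulse leaves the speech unchanged
    gain = rms(speech) / (impulse_rms * 10 ** (snr_db / 20))
    mixed = list(speech)
    for i, v in enumerate(impulse):
        if offset + i >= len(mixed):
            break  # drop the part of the impulse that runs past the speech
        mixed[offset + i] += gain * v
    return mixed
```

Sweeping `snr_db` and `offset` over a grid would produce the kind of loudness and alignment variations whose effect on each acoustic cue could then be measured.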
