Abstract

Describes the segmentation and broad classification used in the Carnegie Mellon University (CMU) speaker-independent continuous speech recognition system. The objective is to segment the speech so that the segment boundaries define the phonetic events in the speech waveform, and to classify each segment into one of three broad classes: silence, sonorant, or fricative. Because this objective is difficult and, in many cases, the correct segmentation is uncertain, the authors produce a network that provides reasonable alternative segmentations. Their approach has three components. First, the speech is segmented in a hierarchical fashion. Next, each segment is assigned broad class probabilities. Finally, a network is generated using the hierarchical segmentation, the broad class probabilities, and additional speech knowledge. Experiments demonstrate that networks generated using additional speech knowledge are superior to those generated purely from the hierarchical segmentation.
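
To make the idea of a hierarchical segmentation that yields a network of alternative segmentations concrete, here is a minimal toy sketch. It is not the authors' method: the abstract does not specify the features or the splitting criterion, so the scalar per-frame feature, the mean-gap criterion, the min_gap threshold, and every function name below are assumptions made only for illustration. The sketch also omits the broad class probabilities and the additional speech knowledge that the paper uses when building its network.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Node:
    """One segment in the hierarchy, covering frames [start, end)."""
    start: int
    end: int
    children: List["Node"] = field(default_factory=list)


def mean(xs: List[float]) -> float:
    return sum(xs) / len(xs)


def best_split(frames: List[float], start: int, end: int) -> Optional[Tuple[int, float]]:
    """Return (boundary, score) maximizing the gap between left and right means.

    Assumed criterion for this toy example only.
    """
    best = None
    for k in range(start + 1, end):
        score = abs(mean(frames[start:k]) - mean(frames[k:end]))
        if best is None or score > best[1]:
            best = (k, score)
    return best


def build_tree(frames: List[float], start: int, end: int, min_gap: float) -> Node:
    """Recursively split a segment while the best boundary is strong enough."""
    node = Node(start, end)
    split = best_split(frames, start, end)
    if split is not None and split[1] >= min_gap:
        k = split[0]
        node.children = [
            build_tree(frames, start, k, min_gap),
            build_tree(frames, k, end, min_gap),
        ]
    return node


def network_edges(node: Node) -> Set[Tuple[int, int]]:
    """Treat every segment in the hierarchy as an arc (start, end)."""
    edges = {(node.start, node.end)}
    for child in node.children:
        edges |= network_edges(child)
    return edges


def count_paths(edges: Set[Tuple[int, int]], start: int, end: int) -> int:
    """Count the alternative segmentations, i.e. arc paths from start to end."""
    if start == end:
        return 1
    return sum(count_paths(edges, e, end) for (s, e) in edges if s == start)


# Hypothetical per-frame feature values (for example, a smoothed energy).
frames = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 1.0, 1.0]
tree = build_tree(frames, 0, len(frames), min_gap=2.0)
edges = network_edges(tree)
n_alternatives = count_paths(edges, 0, len(frames))
```

Each path through the arcs from the first frame to the last is one candidate segmentation, so coarse and fine levels of the hierarchy coexist in a single network. In a fuller version, each arc would also carry silence, sonorant, and fricative probabilities, and arcs could be added or pruned using further knowledge, as the abstract describes.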
