Abstract

The Speech Systems Incorporated (SSI) commercial, large-vocabulary, speaker-independent, continuous speech recognition system is described. The system uses a novel approach to speech representation: a two-stage encoding of speech, with an intervening compression of acoustic frames (segmentation) between the encoding stages, and a linguistic decoding process suitable for large, variable-duration segments. Binary decision trees trained using the maximum mutual information (MMI) criterion serve as encoders. The features used in encoding are listed, and their ability to discriminate the phonetic content of the speech is analyzed. Recognition results are given for a speaker-independent, continuous-speech, grammar-constrained radiology reporting product and for an isolated-word grammar of high perplexity.
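
As a rough illustration of the MMI criterion mentioned above, the sketch below picks the binary question at a single tree node that maximizes the mutual information between the yes/no answer and the phonetic label. The abstract does not specify the form of the questions or the features, so the threshold questions, the two-dimensional toy features, and the function names (entropy, mutual_information, best_threshold_question) are all assumptions made for this example, not the SSI implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(answers, labels):
    """I(Q; C) = H(C) - H(C | Q) for a binary question Q with boolean answers."""
    n = len(labels)
    yes = [c for a, c in zip(answers, labels) if a]
    no = [c for a, c in zip(answers, labels) if not a]
    conditional = (len(yes) / n) * entropy(yes) + (len(no) / n) * entropy(no)
    return entropy(labels) - conditional


def best_threshold_question(frames, labels):
    """Search hypothetical questions of the form 'feature j > t' and return
    (feature_index, threshold, mutual_information) for the best one."""
    best = (None, None, 0.0)
    for j in range(len(frames[0])):
        values = sorted({f[j] for f in frames})
        # Candidate thresholds: midpoints between adjacent observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            answers = [f[j] > t for f in frames]
            mi = mutual_information(answers, labels)
            if mi > best[2]:
                best = (j, t, mi)
    return best


# Toy data (assumed): feature 0 separates the two labels, feature 1 does not.
frames = [[0.1, 5.0], [0.2, 1.0], [0.8, 5.0], [0.9, 1.0]]
labels = ["s", "s", "a", "a"]
feature, threshold, mi = best_threshold_question(frames, labels)
```

On this toy data the search selects feature 0, since a threshold between 0.2 and 0.8 splits the two labels cleanly. A full encoder would apply the same selection recursively at each child node until a stopping rule is met.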
