Abstract

Research in large vocabulary speech recognition has been pursued intensively worldwide over the past several years, spurred by advances in algorithms, architectures, and hardware. In the United States, the DARPA community has focused on several continuous speech recognition tasks, including Naval Resource Management, a 991-word task; ATIS (Air Travel Information System), a speech understanding task with an open vocabulary (in practice on the order of several thousand words) and a natural language component; and Wall Street Journal, a voice dictation task with a vocabulary on the order of 20,000 words. Although we have learned a great deal about how to build and efficiently implement large vocabulary speech recognition systems, there remains a whole range of fundamental questions for which we have no definitive answers. In this paper we review the basic structure of a large vocabulary speech recognition system and address the basic system design issues. We discuss the considerations in the selection of training material, the choice of subword unit, the method of training and adapting subword unit models, the integration of the language model, and the implementation of the overall system. We also report some recent results, obtained at AT&T Bell Laboratories, on the Resource Management task.
