Abstract

During the past several years, research in large-vocabulary speech recognition has been carried out intensively worldwide, encouraged by advances in algorithms, architectures, and hardware. In the United States, the Defense Advanced Research Projects Agency (DARPA) spoken-language-processing community has focused its efforts on several tasks. These include the 991-word Naval Resource Management (RM) speech-recognition task; the open-vocabulary, spontaneous-speech Air Travel Information System (ATIS) speech-understanding task; and the 20,000-word Wall Street Journal (WSJ) dictation task. Although researchers have learned a great deal about how to build and efficiently implement large-vocabulary speech-recognition systems, many fundamental questions remain for which there are no definitive answers. This paper focuses on the basic structure of a large-vocabulary speech-recognition system, considerations in choosing a set of subword units, methods of training, integration of a language model, and implementation of a complete system. The paper also reports some recent results, obtained at AT&T Bell Laboratories, on the DARPA RM task.
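The abstract only names the major system components; as a point of reference, the sketch below states the standard maximum a posteriori decoding rule through which recognizers of this kind typically combine an acoustic model built from subword units with a language model. This is a general formulation under that assumption, not an equation quoted from the paper itself.

\[
\hat{W} \;=\; \arg\max_{W} \, P(W \mid A) \;=\; \arg\max_{W} \, P(A \mid W)\, P(W),
\]

where \(A\) is the observed acoustic evidence, \(P(A \mid W)\) is the acoustic-model likelihood obtained by concatenating subword-unit models according to a pronunciation lexicon for the word sequence \(W\), and \(P(W)\) is the language-model probability of that word sequence.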
