Abstract

This paper details our work in developing a technique which can automatically generate class n-gram language models from natural language (NL) grammars in dialogue systems. The procedure eliminates the need for double maintenance of the recognizer language model and NL grammar. The resulting language model adopts the standard class n-gram framework for computational efficiency. Moreover, both the n-gram classes and training sentences are enhanced with semantic/syntactic tags defined in the NL grammar, such that the trained language model preserves the distinctive statistics associated with different word senses. We have applied this approach in several different domains and languages, and have evaluated it on our most mature dialogue systems to assess its competitiveness with preexisting n-gram language models. The speech recognition performances with the new language model are in fact the best we have achieved in both the JUPITER weather domain and the MERCURYflight reservation domain.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call