Chapter 2 - Language Models for Text Entry

Kumiko Tanaka-Ishii

doi:10.1016/b978-012373591-1/50002-4

Abstract

This chapter presents statistical language models, which are built from language data and are useful for text entry. Language models give text entry systems the power to predict unseen text that a user is likely to generate. The huge amount of language data now available has become an important tool for raising the efficiency of entry systems. Recent entry methods can be classified either as character-based nonpredictive methods or as word/phrase-based predictive methods, the latter with or without completion and either nonadaptive or adaptive. All of these can be modeled by Shannon's noisy channel model, which is regarded as the most basic and important language model for designing them. If a nonpredictive entry method is to be designed, the language models presented in the chapter provide ways to estimate character probabilities, which are to be considered alongside other human factors. If a predictive entry method is to be designed, the designer must decide how to construct the initial language model and whether to incorporate adaptation. As for the initial model, if only word-based prediction is required, a bigram or trigram n-gram model will usually suffice.
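To make the idea of word-based prediction with completion and adaptation concrete, the following is a minimal sketch of a bigram predictor. It is an illustration only, not the chapter's own implementation: the function names (train_bigrams, predict, adapt), the "<s>" start marker, the toy corpus, and the use of unsmoothed relative frequencies are all assumptions made for this example.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count word bigrams; "<s>" (an assumed marker) stands for sentence start."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        prev = "<s>"
        for word in sentence.lower().split():
            counts[prev][word] += 1
            prev = word
    return counts

def predict(counts, prev, prefix="", k=3):
    """Return up to k (word, probability) pairs for words that followed `prev`
    in the training data and start with `prefix` (completion).
    Probabilities are unsmoothed relative frequencies."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    ranked = sorted(
        ((w, c / total) for w, c in followers.items() if w.startswith(prefix)),
        key=lambda pair: (-pair[1], pair[0]),  # most probable first, ties alphabetical
    )
    return ranked[:k]

def adapt(counts, prev, word):
    """Adaptation: update the model with a word the user actually entered."""
    counts[prev][word] += 1

# Hypothetical toy corpus standing in for real language data.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = train_bigrams(corpus)
suggestions = predict(model, "the", prefix="m")  # candidates after "the" starting with "m"
```

A nonadaptive method would stop after training, while an adaptive one would call adapt each time the user confirms a word, so that frequently entered words rise in the ranking. A practical system would also need smoothing for unseen word pairs, which this sketch omits.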
