Abstract

Reduced keyboards are text-entry keyboards with fewer than 26 alphabetic keys, which certain physically disabled persons can therefore access and use more easily than a conventional “QWERTY” keyboard. Automatic character disambiguation systems enable text to be typed on a reduced keyboard with a keying efficiency approaching one keystroke per character, even though each key represents more than one alphabetic character. Existing disambiguation systems typically use probabilistic models of character sequences (n-grams), drawn from representative text samples, to predict the next character and hence to disambiguate among the characters on each key. N-grams for such models have previously been extracted from large (1 million word) text corpora. The research reported here shows that a much smaller corpus of limited domain can be used with similar results, which facilitates the development of disambiguation systems by eliminating the need for a large corpus. Four reduced keyboard layouts are compared: three that were used in earlier research on character disambiguation in Dutch and English, and a fourth, based on character frequency, which achieves efficiency similar to that of the first three. Models of different order are also compared, the “order” of a model being determined by the length of the longest n-grams in it. The principal result is that, while higher-order models (containing long n-grams) perform better than lower-order models (containing only short n-grams), lower-order n-grams can contribute significantly to the disambiguation performance of a higher-order model and should therefore be included to maximize disambiguation efficiency.
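As a rough illustration of the n-gram approach summarized above, the Python sketch below trains character n-gram counts on a tiny corpus and uses them to pick one character per key press, falling back to shorter contexts when the longest context offers no evidence. Everything specific here is an assumption made for illustration and is not taken from the paper: the layout (telephone-style letter groups plus a space key), the toy corpus, the function names `train`, `encode`, and `disambiguate`, and the scoring by raw counts with simple backoff.

```python
from collections import Counter, defaultdict

# Hypothetical layout: 8 letter keys plus a space key (not one of the paper's four layouts).
KEYS = ["abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz", " "]
KEY_OF = {ch: i for i, group in enumerate(KEYS) for ch in group}


def train(text, order):
    """Count, for every context of length 0..order-1, which character follows it."""
    counts = defaultdict(Counter)
    for i, ch in enumerate(text):
        for n in range(order):          # n is the context length
            if i - n >= 0:
                counts[text[i - n:i]][ch] += 1
    return counts


def encode(text):
    """Map each character to the index of the key that carries it."""
    return [KEY_OF[ch] for ch in text]


def disambiguate(key_presses, counts, order):
    """Choose one character per key press, backing off from long to short contexts."""
    out = ""
    for key in key_presses:
        candidates = KEYS[key]
        choice = candidates[0]          # fallback if no n-gram has seen any candidate
        for n in range(min(order - 1, len(out)), -1, -1):
            seen = counts.get(out[len(out) - n:])
            if not seen:
                continue
            scored = [(seen[c], c) for c in candidates if seen[c] > 0]
            if scored:
                choice = max(scored)[1]  # highest count; ties go to the later letter
                break
        out += choice
    return out


model = train("the cat sat on the mat", order=3)
print(disambiguate(encode("the"), model, order=3))  # prints: the
print(disambiguate(encode("hat"), model, order=3))  # prints: hat
```

The word "hat" never occurs in the toy corpus, so no trigram covers it; the sketch recovers it only because the shorter contexts (the single-character counts and the counts following "a") still carry evidence. This is a small-scale illustration of why lower-order n-grams are worth keeping inside a higher-order model, not a reproduction of the paper's experiments.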
