Abstract

In this paper, a statistical language model is proposed to predict the user's next word and thereby improve the performance of the input method. The paper constructs a general language model and a user language model, and then combines them into a new model called the dynamic and self-study language model. Using the general language model in our experiment, the average length of input codes (ALIC) is reduced from 2.557 to 2.479, and the hit rate of first characters (HRFC) is improved from 78.704% to 96.202%. Using the dynamic and self-study language model, when the number of input Chinese characters is less than 20 thousand, the HRFC rises rapidly while the ALIC falls rapidly; once the number exceeds 20 thousand, both the HRFC and the ALIC become stable. It is thus clear that the dynamic and self-study language model performs well for the input method. In addition, we provide a modified Church-Gale smoothing method that reduces the size of the general language model to 5 percent of its original size, so that it fits the constraints of handheld devices.
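The combination of a static general model with a self-updating user model could be sketched as linear interpolation of bigram probabilities. This is a minimal illustration under assumptions: the abstract does not specify the combining rule, and the class name, method names, and the weight `lam` are all hypothetical.

```python
from collections import defaultdict

class InterpolatedLM:
    """Hypothetical sketch of a dynamic, self-study language model:
    a fixed general bigram model interpolated with a user bigram model
    that is updated as the user types. The interpolation weight `lam`
    is an assumption, not taken from the paper."""

    def __init__(self, general_bigrams, lam=0.7):
        # general_bigrams: {prev_word: {next_word: probability}}
        self.general = general_bigrams
        self.lam = lam
        self.user_counts = defaultdict(lambda: defaultdict(int))
        self.user_totals = defaultdict(int)

    def observe(self, prev_word, word):
        """Self-study step: record the word the user actually chose."""
        self.user_counts[prev_word][word] += 1
        self.user_totals[prev_word] += 1

    def prob(self, prev_word, word):
        """Interpolated probability: lam * P_general + (1 - lam) * P_user."""
        p_gen = self.general.get(prev_word, {}).get(word, 0.0)
        total = self.user_totals[prev_word]
        p_user = self.user_counts[prev_word][word] / total if total else 0.0
        return self.lam * p_gen + (1 - self.lam) * p_user

    def predict(self, prev_word, candidates):
        """Rank candidates; the input method would offer the top one first."""
        return max(candidates, key=lambda w: self.prob(prev_word, w))
```

As the user model accumulates observations, it can override the general model's ranking, which mirrors the abstract's report that the HRFC improves rapidly over the first ~20 thousand input characters and then stabilizes.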
