Abstract

We propose a phoneme-based Chinese input method with a low conflict-code rate, in which every input element is a phonetic symbol. First, we retain two key phonetic symbols of a character as the first part of its features; that is, we reduce an effective phonetic sequence to a reduced phonetic sequence of length at most 2. To overcome the difficulty of decomposing characters, we define an extended radical set consisting of 5,401 frequently used Chinese characters, radicals, and seven primitive strokes. Following the writing sequence of a character, we decompose it into two extended radicals, one containing the first stroke and the other containing the last stroke. We then select the first phonetic symbol of each extended radical as a phonetic feature symbol, which yields two phonetic feature symbols from the writing sequence of a character. Appending these two phonetic feature symbols to the reduced phonetic sequence gives a phonetic code whose maximal length is 4. Under the basic phonetic input method, the average number of homonyms is 10.3844. Under the proposed phoneme-based method, the average number of characters sharing the same phonetic code is 1.3967, which is considerably smaller. As a result, we obtain a phoneme-based input method with a low conflict-code rate of 24.72%, and ease of use also improves significantly.
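The encoding steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which two phonetic symbols are retained, so keeping the first and last symbol is an assumption made here for illustration. The function names, the placeholder characters `C1` to `C4`, and the Latin letters standing in for phonetic symbols are all hypothetical, and the toy data does not reproduce the reported figures.

```python
from collections import defaultdict


def reduce_sequence(phonetic):
    """Reduce a phonetic sequence to at most 2 symbols.

    Assumption: keep the first and last symbol. The abstract only says
    that two key symbols are retained, not which ones.
    """
    if len(phonetic) <= 2:
        return phonetic
    return phonetic[0] + phonetic[-1]


def phonetic_code(phonetic, first_radical, last_radical):
    """Build a code of length <= 4 for one character.

    first_radical / last_radical are the phonetic sequences of the
    extended radicals containing the first and last stroke.
    """
    feature_symbols = first_radical[:1] + last_radical[:1]
    return reduce_sequence(phonetic) + feature_symbols


def average_group_size(codes):
    """Average number of characters sharing the same code."""
    groups = defaultdict(list)
    for char, code in codes.items():
        groups[code].append(char)
    return len(codes) / len(groups)


# Hypothetical toy data: (character phonetics, first radical, last radical).
TOY = {
    "C1": ("abc", "de", "fg"),
    "C2": ("abc", "dx", "fy"),  # collides with C1 after encoding
    "C3": ("ab", "g", "h"),
    "C4": ("a", "de", "k"),
}

codes = {char: phonetic_code(*parts) for char, parts in TOY.items()}
```

Here `C1` and `C2` share the reduced sequence and both feature symbols, so they receive the same code, while `C3` and `C4` are unambiguous. Calling `average_group_size(codes)` on a full lexicon would give the kind of average the abstract reports for each method.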
