Abstract

This paper presents a framework for Chinese spelling error detection and correction, aimed at learners of Chinese as a second language, that uses conditional random fields (CRFs) with feature induction. The number of people learning Chinese as a second language has been growing in recent years. CRFs are adopted here to model both global and local information when judging whether a word is correct. Intelligent systems usually rely on local features alone to make such decisions. CRFs are among the most widely used statistical approaches and can adjust the weights of the corresponding features to achieve near-optimal results. This paper investigates an automatic rule induction method that captures hidden features for Chinese spell checking. Taking position information into account, the features are induced automatically by counting occurrences in the training corpus. The CRFs then integrate these features and approach an optimum by adjusting the weights of the related features. The experimental results show that the proposed method outperforms traditional approaches and improves both the detection of misspelled words and their correction. We conclude that the proposed method is close to practical use and is useful for learners of Chinese as a second language.
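
The abstract does not spell out the induction procedure, so the following is only a minimal sketch of what position-aware feature induction by counting might look like. The function names (`induce_position_features`, `char_features`), the window size, the count threshold, and the toy corpus are all illustrative assumptions, not the authors' method. The resulting per-character feature dictionaries are the kind of input a CRF toolkit would weight during training; the CRF itself is not shown.

```python
from collections import Counter


def induce_position_features(sentences, window=1, min_count=2):
    """Hypothetical induction step: count (offset, neighbour, target) triples
    in correct training text and keep those seen at least min_count times."""
    counts = Counter()
    for sent in sentences:
        for i, ch in enumerate(sent):
            for off in range(-window, window + 1):
                j = i + off
                if off == 0 or j < 0 or j >= len(sent):
                    continue  # skip the target itself and out-of-range positions
                counts[(off, sent[j], ch)] += 1
    return {feat for feat, n in counts.items() if n >= min_count}


def char_features(sent, i, induced, window=1):
    """Build a feature dict for the character at position i.
    Each 'seen[offset]' flag records whether the (neighbour, target) pair at
    that relative position matches an induced pattern (assumed encoding)."""
    feats = {"char": sent[i]}
    for off in range(-window, window + 1):
        j = i + off
        if off == 0 or j < 0 or j >= len(sent):
            continue
        feats[f"seen[{off:+d}]"] = (off, sent[j], sent[i]) in induced
    return feats


# Toy corpus of correct sentences (illustrative only).
corpus = ["我們今天去學校", "他們今天去公園"]
induced = induce_position_features(corpus)

ok = char_features("今天", 1, induced)    # correct character after "今"
typo = char_features("今夭", 1, induced)  # visually similar misspelling
```

In this sketch, a character whose positional context matches no induced pattern (as with the misspelled "夭") gets `False` flags, which a CRF could learn to associate with an error label.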
