Abstract
To improve the recognition performance of Chinese named entities, transformation-based machine learning is introduced to repair errors made during word segmentation and part-of-speech (POS) tagging. Because Chinese text is written without explicit word boundaries, the words in a sentence must be segmented before they can be processed by subsequent Chinese named entity recognition components. POS tagging is likewise a fundamental task for Chinese named entity recognition. One way to raise the quality of both steps, when a word segmentation and POS tagging tool is already available, is to repair as many of its errors as possible. This paper introduces an effective error repairer based on transformation-based error-driven machine learning. The repairer covers detecting error positions, producing error-repairing rules, selecting higher-scoring rules, ordering rules, and distinguishing the conditions under which each rule applies. The experimental results show that word segmentation and POS tagging errors are significantly reduced and that performance improves accordingly.
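The learning loop named in the abstract (detect error positions, propose repair rules, score them, keep the best, repeat) can be illustrated with a generic transformation-based error-driven learner. The following is a minimal sketch, not the paper's implementation: the single rule template "change tag A to B when the previous tag is C", the scoring by net corrections, the function names, and the toy tags are all assumptions made for illustration.

```python
def apply_rule(tags, rule):
    """Apply one rule (from_tag, to_tag, prev_tag) to a list of tags."""
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        # The condition is checked against the tags as they were before this rule ran.
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def score_rule(rule, current, gold):
    """Net gain of a rule: errors it repairs minus correct tags it breaks."""
    changed = apply_rule(current, rule)
    gain = 0
    for before, after, truth in zip(current, changed, gold):
        if before != after:
            gain += (after == truth) - (before == truth)
    return gain


def learn_rules(initial, gold, min_score=1):
    """Greedily learn an ordered rule list; min_score should stay >= 1 so the loop ends."""
    current = list(initial)
    rules = []
    while True:
        # Detect error positions and propose one candidate rule per error.
        candidates = {
            (current[i], gold[i], current[i - 1])
            for i in range(1, len(current))
            if current[i] != gold[i]
        }
        if not candidates:
            break
        # Sorting makes tie-breaking deterministic; max keeps the highest-scoring rule.
        best = max(sorted(candidates), key=lambda r: score_rule(r, current, gold))
        if score_rule(best, current, gold) < min_score:
            break
        rules.append(best)  # rules are stored in the order they must be applied
        current = apply_rule(current, best)
    return rules, current


# Toy example with hypothetical tags: the baseline tagger mislabels nouns after "DT".
baseline = ["DT", "VB", "VB", "DT", "VB", "VB"]
reference = ["DT", "NN", "VB", "DT", "NN", "VB"]
rules, repaired = learn_rules(baseline, reference)
print(rules)
print(repaired)
```

The learned rules form an ordered list, so repairing new output from the same tool means applying them in the order they were learned. A repairer for segmentation errors would need rule templates that merge or split tokens, which this sketch does not cover.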