Abstract
This paper proposes a new approach to personal name recognition in Chinese text. The approach combines rule-based and statistical methods and draws on rich linguistic knowledge. In the first step, we collect personal names as candidate entities; each candidate is then passed to a statistical model, here conditional random fields (CRFs), which decides whether it is a genuine entity. We also propose a dynamic priority method to address the problem that a segment of a foreign personal name may be misrecognized as a Chinese personal name. Our model differs from most previous CRF-based models in several respects, one of which is that it uses probabilistic feature functions instead of binary feature functions. We also explore several new features, including confidence functions, context semantics, and contextual surroundings. As in some previous work, we use sub-models to model Chinese personal names, foreign personal names, and abbreviated personal names separately, but we introduce new techniques in these sub-models. Experimental results show that our CRF model, combining the new elements above, brings significant improvements.
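To make the contrast between binary and probabilistic feature functions concrete, the following is a minimal Python sketch and not the implementation used in this paper. It assumes a hypothetical surname feature for a CRF label `B-PER`; the function names, the label name, and the toy counts are all invented for illustration.

```python
from collections import Counter

# Hypothetical toy statistics (invented for illustration only):
# how often each character occurs as a surname, and how often it occurs at all.
SURNAME_COUNTS = Counter({"王": 90, "李": 80, "张": 85, "高": 20})
TOTAL_COUNTS = Counter({"王": 100, "李": 100, "张": 100, "高": 200})


def binary_surname_feature(char: str, label: str) -> float:
    """Indicator feature: 1.0 if the label is B-PER and the character is a known surname."""
    return 1.0 if label == "B-PER" and char in SURNAME_COUNTS else 0.0


def probabilistic_surname_feature(char: str, label: str) -> float:
    """Real-valued feature: relative frequency with which the character is used as a surname."""
    if label != "B-PER" or TOTAL_COUNTS[char] == 0:
        return 0.0
    return SURNAME_COUNTS[char] / TOTAL_COUNTS[char]


if __name__ == "__main__":
    for ch in ["王", "高", "的"]:
        print(ch,
              binary_surname_feature(ch, "B-PER"),
              probabilistic_surname_feature(ch, "B-PER"))
```

Under these assumed counts, the binary feature treats a frequent surname such as 王 and an ambiguous character such as 高 identically, whereas the real-valued feature gives the model a graded signal for each.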