Abstract

With the development of natural language processing (NLP) technology, automatic named entity recognition (NER) has become increasingly important for improving the performance of information extraction systems. This paper proposes a hybrid model for Chinese person name recognition that is based on conditional random fields (CRFs) and fuses multiple features. Unlike most previous approaches, it uses a single linguistic model to recognize both Chinese person names and transliterated person names, combining multiple knowledge sources and features so that both the internal features of a person name and its context information are considered. By analyzing the contextual word information and the internal particle information of an entity, the features are integrated into a unified framework that includes local features, relation features, global features and heuristic human knowledge. The new linguistic model is built on conditional random fields, and recognition performance is improved. Experimental results on the People's Daily corpus (January 1998) show a precision of 94.87%, a recall of 93.76% and an F-measure of 94.31%. Experiments on the MSRA corpus of SIGHAN 2006 also confirm the improved performance, which shows that the hybrid model is consistent across different test data sources and supports the validity of the approach.
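
The reported scores can be cross-checked with a few lines of Python. This is a minimal sketch that assumes the F-measure in the abstract is the balanced F1 score, i.e. the harmonic mean of precision and recall; the abstract does not state the weighting, and the function name `f_measure` is introduced here only for illustration.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: equal weighting of precision and recall; the abstract
    does not specify which F-measure variant was used.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Scores reported in the abstract for People's Daily (January 1998), in percent.
precision, recall = 94.87, 93.76
print(f"F-measure: {f_measure(precision, recall):.2f}%")
```

Under that assumption, the computed value agrees with the 94.31% given in the abstract.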
