Abstract

Named entity recognition (NER) is an important step in many natural language applications, such as information extraction, text summarization, and question answering. Chinese NER has special characteristics that make the task difficult. In this paper, we present NER experiments on the corpora used for the Chinese 863 NER task in 2004, based on three models: maximum entropy, the hidden Markov model (HMM), and the more recent conditional random fields (CRFs). The results show that the CRF model outperforms the other two models in best results, average performance, and scalability across data sizes. In our experiments, the CRF approach achieves an overall F1 measure of about 84.39/80.68 on simplified/traditional Chinese NER respectively, a gain of 2.01/10.50 over the best system in the 863 competition.
