Abstract

With the rapid growth of information available on the Internet, it is becoming increasingly difficult to find relevant information quickly on the Web. Named Entity Recognition (NER), a key technique in web information processing tasks such as information retrieval and information extraction, has received increasing attention. In this paper we address the problem of Chinese NER using a hybrid statistical model. The study concentrates on entity names (personal, location and organization names), temporal expressions (dates and times) and number expressions. The method has three characteristics: first, NER and Part-of-Speech tagging are integrated into a unified framework; second, it combines a Hidden Markov Model (HMM) with a Maximum Entropy Model (MEM) by invoking the MEM as a sub-model within the Viterbi algorithm; third, the Part-of-Speech information of the context is used in the MEM. Experiments show that the hybrid statistical model achieves good results for Chinese NER, with F1 values ranging from 74% to 92% across all named entity types on open-test data.
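
The idea of invoking a local model inside Viterbi decoding can be illustrated with a minimal sketch. It is not the paper's implementation: the tag set, the feature templates (current and previous token rather than Part-of-Speech context), the hand-set weights, the uniform transition scores, and the names `viterbi` and `make_loglinear_scorer` are all assumptions made for illustration. The scorer is a small log-linear model standing in for a trained MEM, and its normalized log-score is used in place of the HMM emission term.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Scorer = Callable[[Sequence[str], int, str], float]


def make_loglinear_scorer(weights: Dict[Tuple[str, str], float],
                          states: Sequence[str]) -> Scorer:
    """Hypothetical stand-in for a trained maximum entropy model.

    Returns log P(state | context) from hand-set feature weights.
    """
    def score(tokens: Sequence[str], i: int, state: str) -> float:
        prev = tokens[i - 1] if i > 0 else "<s>"
        feats = [f"w={tokens[i]}", f"prev={prev}"]  # assumed feature templates
        raw = {s: sum(weights.get((f, s), 0.0) for f in feats) for s in states}
        log_z = math.log(sum(math.exp(v) for v in raw.values()))
        return raw[state] - log_z
    return score


def viterbi(tokens: Sequence[str],
            states: Sequence[str],
            log_start: Dict[str, float],
            log_trans: Dict[str, Dict[str, float]],
            local_score: Scorer) -> List[str]:
    """Viterbi decoding where the per-position score comes from a sub-model."""
    if not tokens:
        return []
    # best[i][s]: best log-score of any tag path ending in state s at position i
    best = [{s: log_start[s] + local_score(tokens, 0, s) for s in states}]
    back: List[Dict[str, str]] = []
    for i in range(1, len(tokens)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = (best[-1][prev] + log_trans[prev][s]
                         + local_score(tokens, i, s))  # sub-model invoked here
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy usage with made-up weights and uniform HMM parameters
states = ["O", "PER"]
weights = {("w=Mr", "O"): 2.0, ("w=Wang", "PER"): 3.0,
           ("prev=Mr", "PER"): 2.0, ("w=arrived", "O"): 2.0}
uniform = math.log(0.5)
log_start = {s: uniform for s in states}
log_trans = {p: {s: uniform for s in states} for p in states}
scorer = make_loglinear_scorer(weights, states)
tags = viterbi(["Mr", "Wang", "arrived"], states, log_start, log_trans, scorer)
print(tags)
```

Passing the scorer as a callable keeps the dynamic program independent of how the local probabilities are produced, so a trained model with richer context features could be swapped in without changing the decoder.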
