Abstract

Cross-Document Co-reference Resolution (CDCR) aims to merge words in different texts that refer to the same entity into co-reference chains. Traditional research on CDCR uses clustering methods to address the name disambiguation problem posed in information retrieval. This paper recasts CDCR as a classification problem and uses a Support Vector Machine (SVM) classifier to resolve both name disambiguation and variant consolidation, both of which are prevalent in information extraction. The method can effectively integrate various features, such as morphological, phonetic, and semantic knowledge collected from the corpus and the Internet. An experiment on a Chinese cross-document co-reference corpus shows that the classification method outperforms clustering methods in both precision and recall.
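The classification view of CDCR can be illustrated with a minimal sketch: each pair of mentions is turned into a feature vector, a linear SVM decides whether the pair co-refers, and the accepted pairs are merged into chains. Everything below is an assumption made for illustration and is not taken from the paper: the two toy features (character overlap of the names and context-word overlap) stand in for the morphological, phonetic, and semantic features; the trainer is a simple Pegasos-style sub-gradient method rather than the authors' SVM setup; and the English mentions, function names, and parameters are hypothetical.

```python
from itertools import combinations

def jaccard(x, y):
    """Set overlap in [0, 1]; 0.0 when both sets are empty."""
    x, y = set(x), set(y)
    return len(x & y) / len(x | y) if (x | y) else 0.0

def pair_features(m1, m2):
    """Hypothetical features for a pair of mentions given as (name, context)."""
    return [
        jaccard(m1[0], m2[0]),                  # character overlap of the names
        jaccard(m1[1].split(), m2[1].split()),  # context-word overlap
        1.0,                                    # bias term
    ]

def train_linear_svm(X, y, lam=0.01, epochs=200):
    """Pegasos-style sub-gradient descent on the regularized hinge loss.
    Labels are +1 (co-referent pair) or -1 (not co-referent)."""
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for x, label in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)
            margin = label * sum(wi * xi for wi, xi in zip(w, x))
            w = [(1.0 - eta * lam) * wi for wi in w]      # shrink (regularizer)
            if margin < 1:                                # hinge loss is active
                w = [wi + eta * label * xi for wi, xi in zip(w, x)]
    return w

def predict(w, x):
    """Return +1 if the pair is classified as co-referent, else -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

def build_chains(mentions, w):
    """Link every pair the classifier accepts, then return the connected
    components (as sorted lists of mention indices) as co-reference chains."""
    parent = list(range(len(mentions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(mentions)), 2):
        if predict(w, pair_features(mentions[i], mentions[j])) == 1:
            parent[find(i)] = find(j)
    chains = {}
    for i in range(len(mentions)):
        chains.setdefault(find(i), []).append(i)
    return sorted(chains.values())

if __name__ == "__main__":
    # Toy data (hypothetical): two people who share a name, each with a variant.
    mentions = [
        ("John Smith", "senator from ohio"),
        ("J. Smith", "ohio senator vote"),
        ("John Smith", "guitar player band"),
        ("Johnny Smith", "band guitar tour"),
    ]
    gold = [0, 0, 1, 1]  # hypothetical entity ids used only to label pairs
    pairs = list(combinations(range(len(mentions)), 2))
    X = [pair_features(mentions[i], mentions[j]) for i, j in pairs]
    y = [1 if gold[i] == gold[j] else -1 for i, j in pairs]
    w = train_linear_svm(X, y)
    print(build_chains(mentions, w))
```

In this sketch, name disambiguation corresponds to keeping the two "John Smith" mentions apart when their contexts differ, and variant consolidation corresponds to linking "J. Smith" with "John Smith" when their contexts agree. Taking connected components is one simple way to turn pairwise decisions into chains; other merging strategies are possible.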
