Abstract

This paper considers five features for personal name disambiguation: personal titles, community chains, terms, temporal expressions, and hostnames. Using 9 test data sets covering 3 ambiguous personal names, we address three issues: the degree of awareness of an entity, the source of materials, and Web pages in different areas. Two approaches are proposed: a single-clusterer approach and a cascaded multiple-clusterer approach. The experiments show that the proposed features are useful, that the multiple-clusterer approach outperforms the single-clusterer approach, and that expanding community chains using the Web has a positive effect on personal name disambiguation.

