Abstract

In this paper, we consider the entity resolution (ER) problem, which is to identify objects that refer to the same real-world entity. Prior work on ER relies on expensive similarity comparisons and clustering. Additionally, the quality of entity resolution may be low when the objects themselves carry insufficient information. To address these problems, we present a novel ER framework, context-based entity description (CED), which exploits the context information of data objects. In our framework, each entity is described by a set of CEDs. During entity resolution, each object is compared only with CEDs to determine its corresponding entity. We also propose efficient algorithms for CED discovery and CED-based entity resolution. We experimentally evaluated our CED-based ER algorithm on real DBLP datasets, and the results show that it achieves both high precision and high recall and outperforms existing methods.
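
To make the idea of comparing objects only against CEDs more concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes, purely for illustration, that a CED can be modeled as a set of context items (for example, co-author names in a bibliographic record) and that an object is assigned to an entity when one of that entity's CEDs is fully contained in the object's context. The names `Entity`, `matches`, and `resolve`, and the containment rule itself, are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """A real-world entity described by a set of CEDs (hypothetical structure)."""
    name: str
    # Assumption: each CED is modeled as a frozenset of context items.
    ceds: list[frozenset[str]] = field(default_factory=list)


def matches(ced: frozenset[str], context: set[str]) -> bool:
    """Assumed rule: a CED covers an object if every CED item is in the object's context."""
    return ced <= context


def resolve(context: set[str], entities: list[Entity]) -> str | None:
    """Return the name of the first entity with a CED covering the context, else None."""
    for entity in entities:
        if any(matches(ced, context) for ced in entity.ceds):
            return entity.name
    return None


if __name__ == "__main__":
    # Placeholder data, not taken from DBLP.
    entities = [
        Entity("author_1", [frozenset({"coauthor_a", "coauthor_b"})]),
        Entity("author_2", [frozenset({"coauthor_c"})]),
    ]
    print(resolve({"coauthor_c", "coauthor_x"}, entities))  # prints author_2
```

Under these assumptions, each object is checked only against the stored descriptions, so no pairwise object-to-object similarity comparison or clustering step is needed at resolution time.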
