Abstract

Constraints arise in many real-life applications of entity resolution (ER), the problem of determining which records in one or more databases refer to the same entities. However, specifying such constraints and using them effectively in ER tasks is challenging. In particular, not every constraint is equally robust, so attaching weights that express the “confidence” in each constraint is a natural choice. In this paper, the authors study ER in the presence of weighted constraints. They propose a unified framework that associates a weight with each constraint, capturing the confidence in its robustness within an ER model. They develop an approach to learn weighted constraints based on domain knowledge, and investigate how effectively and efficiently weighted constraints can be used for generating an ER clustering and for determining a propagation order across multiple entity types. Their experimental study shows that using weighted constraints can lead to improved ER quality and scalability.

