Abstract

Entity resolution is the process of determining whether a collection of entity representations refer to the same entity in the real world. In this paper we introduce a theoretical framework that supports knowledge-based entity resolution. From a logical point of view, the expressive power of the framework is equivalent to a decidable fragment of first-order logic including conjunction, disjunction and a certain form of negation. Although the framework is expressive for representing knowledge about entity resolution in a collective way, the questions that arise are: (1) how efficiently can knowledge patterns be processed; (2) how effectively can redundancy among knowledge patterns be eliminated. In answering these questions, we first study the evaluation problem for knowledge patterns. Our results show that this problem is NP-complete w.r.t. combined complexity but in ptime w.r.t. data complexity. This nice property leads us to investigate the containment problem for knowledge patterns, which turns out to be NP-complete. We further develop a notion of optimality for knowledge patterns and a mechanism of optimizing a knowledge model (i.e. a finite set of knowledge patterns). We prove that the optimality decision problem for knowledge patterns is still NP-complete.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call