Abstract

This chapter discusses the principles of entity resolution (ER). ER is the process of determining whether two references to real-world objects refer to the same object or to different objects. The term entity denotes the real-world object (a person, place, or thing), and the term resolution is used because ER is fundamentally a decision process. Linking appends a common identifier to reference instances to record the decision that they are equivalent. Identity resolution, record linking, record matching, record deduplication, merge-purge, and entity analytics all represent particular forms or aspects of ER. In its broadest sense, ER encompasses five major activities: entity reference extraction, entity reference preparation, entity reference resolution, entity identity management, and entity relationship analysis. Exact and approximate matching are important tools in all five activities, but direct matching of references is not the only method for determining reference equivalence.
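As an illustration of matching and linking, the following is a minimal sketch and not the chapter's own method. It assumes references are plain name strings, uses Python's standard `difflib.SequenceMatcher` as a stand-in approximate matcher, and picks an arbitrary similarity threshold of 0.85. The helper names `normalize`, `is_match`, and `link` are hypothetical. References judged equivalent, directly or through a chain of matches, receive the same link identifier.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(value):
    """Lowercase and collapse whitespace (a simple preparation step)."""
    return " ".join(value.lower().split())


def is_match(a, b, threshold=0.85):
    """Exact match after normalization, else approximate match.

    The threshold is an assumed value chosen for illustration only.
    """
    a, b = normalize(a), normalize(b)
    if a == b:
        return True
    return SequenceMatcher(None, a, b).ratio() >= threshold


def link(references, threshold=0.85):
    """Return one link identifier per reference.

    Matching pairs are merged with a union-find structure, so two
    references can share an identifier without matching each other
    directly, as long as a chain of matches connects them.
    """
    parent = list(range(len(references)))

    def find(i):
        # Follow parent pointers to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(references)), 2):
        if is_match(references[i], references[j], threshold):
            ri, rj = find(i), find(j)
            if ri != rj:
                # Keep the smaller index as the cluster identifier.
                parent[max(ri, rj)] = min(ri, rj)

    return [find(i) for i in range(len(references))]


if __name__ == "__main__":
    refs = ["John Smith", "john  smith", "Jon Smith", "Mary Jones"]
    for ref, link_id in zip(refs, link(refs)):
        print(link_id, ref)
```

Comparing every pair is quadratic in the number of references, so this form suits only small inputs. The transitive merging step is one simple way in which equivalence can be decided without a direct match between two references.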

