Different Similarity Measures for Secure Name Matching in Record Linkage

Vijay Maruti Shelake, Narendra M Shekokar

doi:10.1201/9781003218555-18

Abstract

In the real world, information about the same person (entity) is often scattered across multiple databases. Record linkage identifies the records that refer to the same person across these databases so that the data can be integrated and analyzed. Names frequently contain errors, such as spelling mistakes and phonetic variations, which reduce linkage accuracy. Name matching for record linkage therefore relies on similarity measures to compare records that may belong to the same person. Because databases hold important personal information, secure, privacy-preserving record linkage is essential; it is carried out by encoding personal identifiers such as a person’s name. A Bloom filter-based encryption mechanism is used to encrypt these identifiers. In secure record linkage, similarity measures play an important role in matching the encoded names. This chapter identifies and discusses the various similarity measures for secure name matching. Among these measures, token-based coefficients are analyzed in terms of linkage accuracy for secure name matching.
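As a rough illustration of the idea summarized above, the following Python sketch encodes names into Bloom filters and compares the encodings with two set-based coefficients. It is a minimal sketch under stated assumptions, not the chapter's implementation: the function names, the padded-bigram tokenization, the filter size of 1000 bits, the 20 keyed hash functions, and the placeholder key are all hypothetical choices made for this example, and Dice and Jaccard are used here only as familiar examples of token-based coefficients.

```python
import hashlib
import hmac


def bigrams(name):
    """Split a name into padded character bigrams (assumed tokenization)."""
    padded = f"_{name.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(name, size=1000, num_hashes=20, key=b"shared-secret"):
    """Return the set of bit positions switched on for a name.

    Each bigram is hashed num_hashes times with a keyed hash (HMAC-SHA256).
    The size, num_hashes, and key values are illustrative assumptions.
    """
    bits = set()
    for gram in bigrams(name):
        for k in range(num_hashes):
            digest = hmac.new(key, f"{k}:{gram}".encode(), hashlib.sha256).digest()
            bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


def dice(a, b):
    """Dice coefficient of two bit-position sets: 2|A n B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard coefficient of two bit-position sets: |A n B| / |A u B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    reference = bloom_encode("smith")
    for candidate in ("smith", "smyth", "jones"):
        encoded = bloom_encode(candidate)
        print(candidate, round(dice(reference, encoded), 3),
              round(jaccard(reference, encoded), 3))
```

In this sketch both parties would need to use the same key and parameters so that identical bigrams set identical bit positions. A spelling variant such as "smyth" shares most of its bigrams with "smith" and therefore scores higher than an unrelated name, which is the property that lets similarity measures tolerate name errors on encoded data.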

Full Text