Abstract

In this paper, we evaluated the predictive performance of probabilistic record linkage algorithms, discussing the implications of different configurations of blocking keys, string similarity functions and phonetic code on the prediction’s overall performance and computational complexity. Furthermore, we carried out a bibliographical survey of the main deterministic and probabilistic record linkage methods, as well as of recent advances combining machine learning techniques and main packages and implementations available in open-source R language. The results can provide heuristics for problems of administrative records integration at the national level and have potential value for the formulation and evaluation of public policies.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call