Abstract

Background: Health and Demographic Surveillance Systems (HDSS) have been instrumental in advancing population and health research in low- and middle-income countries where vital registration systems are often weak. However, the utility of HDSS would be enhanced if their databases could be linked with those of local health facilities. We assess the feasibility of record linkage in rural South Africa using data from the Agincourt HDSS and a local health facility.

Methods: Using a gold standard dataset of 623 record pairs matched by means of fingerprints, we evaluate twenty record linkage scenarios (involving different identifiers, string comparison techniques and with and without clerical review) based on the Fellegi-Sunter probabilistic record linkage model. Matching rates and quality are measured by their sensitivity and positive predictive value (PPV). Background characteristics of matched and unmatched cases are compared to assess systematic bias in the resulting record-linked dataset.

Results: A hybrid approach of deterministic followed by probabilistic record linkage, and scenarios that use an extended set of identifiers including another household member’s first name yield the best results. The best fully automated record linkage scenario has a sensitivity of 83.6% and PPV of 95.1%. The sensitivity and PPV increase to 84.3% and 96.9%, respectively, when clerical review is undertaken on 10% of the record pairs. The likelihood of being linked is significantly lower for females, non-South Africans and the elderly.

Conclusion: Using records matched by means of fingerprints as the gold standard, we have demonstrated the feasibility of fully automated probabilistic record linkage using identifiers that are routinely collected in health facilities in South Africa. Our study also shows that matching statistics can be improved if other identifiers (e.g., another household member’s first name) are added to the set of matching variables, and, to a lesser extent, with clerical review. Matching success is, however, correlated with background characteristics that are indicative of the instability of personal attributes over time (e.g., surname in the case of women) or with misreporting (e.g., age).
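
To make the terms in the abstract concrete, the following is a minimal sketch of how a Fellegi-Sunter match weight, a two-threshold decision rule with a clerical review band, and the two quality measures (sensitivity and PPV) can be computed. It is not the study's implementation: the field names, the m- and u-probabilities, the thresholds, and the counts are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical m- and u-probabilities per identifier (illustrative only).
# m = P(field agrees | the two records belong to the same person)
# u = P(field agrees | the two records belong to different people)
FIELD_PROBS = {
    "first_name": (0.90, 0.01),
    "surname": (0.85, 0.02),
    "birth_year": (0.80, 0.05),
}


def match_weight(agreements):
    """Sum the log2 likelihood ratios over all fields for one record pair."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if agreements[field]:
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


def classify(weight, lower, upper):
    """Two-threshold rule; pairs between the thresholds go to clerical review."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "clerical review"


def sensitivity(true_pos, false_neg):
    """Share of gold-standard matches that the linkage recovered."""
    return true_pos / (true_pos + false_neg)


def ppv(true_pos, false_pos):
    """Share of declared links that are true matches."""
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    pair = {"first_name": True, "surname": False, "birth_year": True}
    w = match_weight(pair)
    # Thresholds of 0 and 10 are arbitrary placeholders.
    print(round(w, 2), classify(w, lower=0.0, upper=10.0))
    # Hypothetical counts, not taken from the study.
    print(sensitivity(80, 20), round(ppv(80, 5), 3))
```

In this sketch, widening the gap between the two thresholds sends more pairs to clerical review, which is the trade-off the abstract describes when review is applied to a share of the record pairs.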

Highlights

  • Health and Demographic Surveillance Systems (HDSS) have been instrumental in advancing population and health research in low- and middle-income countries where vital registration systems are often weak

  • The identifiers used as linking variables in the various scenarios are more complete in the Agincourt HDSS data than in the Agincourt Health Centre (AHC) data (Table 2)

  • Another household member’s first name and surname, the National ID number, and the telephone number are often missing in the AHC dataset

Summary

Introduction

Health and Demographic Surveillance Systems (HDSS) enumerate populations in geographically well-defined areas and prospectively collect detailed information on vital events including births, deaths, and migrations, as well as complementary data covering health, social and economic indicators [1,2,3]. These data allow for population-based investigations of population and health dynamics and their determinants in low- and middle-income countries where vital registration systems are often weak [2]. In order to achieve further reductions in mortality levels, it is important to understand whether individuals dying of AIDS have had any contact with the health facilities and the nature of that contact (e.g., diagnosis, in care awaiting treatment initiation, on first line treatment). This is difficult without linking HDSS and health facility data.
