Abstract
Background
Cohort studies can provide valuable evidence of cause and effect relationships but are subject to loss of participants over time, limiting the validity of findings. Computerised record linkage offers a passive and ongoing method of obtaining health outcomes from existing routinely collected data sources. However, the quality of record linkage is reliant upon the availability and accuracy of common identifying variables. We sought to develop and validate a method for linking a cohort study to a state-wide hospital admissions dataset with limited availability of unique identifying variables.
Methods
A sample of 2000 participants from a cohort study (n = 41 514) was linked to a state-wide hospitalisations dataset in Victoria, Australia using the national health insurance (Medicare) number and demographic data as identifying variables. Availability of the health insurance number was limited in both datasets; therefore linkage was undertaken both with and without use of this number and agreement tested between both algorithms. Sensitivity was calculated for a sub-sample of 101 participants with a hospital admission confirmed by medical record review.
Results
Of the 2000 study participants, 85% were found to have a record in the hospitalisations dataset when the national health insurance number and sex were used as linkage variables and 92% when demographic details only were used. When agreement between the two methods was tested the disagreement fraction was 9%, mainly due to "false positive" links when demographic details only were used. A final algorithm that used multiple combinations of identifying variables resulted in a match proportion of 87%. Sensitivity of this final linkage was 95%.
Conclusions
High quality record linkage of cohort data with a hospitalisations dataset that has limited identifiers can be achieved using combinations of a national health insurance number and demographic data as identifying variables.
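The comparison of the two linkage strategies can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes exact (deterministic) matching, and the field names (`medicare`, `sex`, `dob`, `postcode`), the toy records and the definition of the disagreement fraction (the share of participants whose link status differs between the two methods) are all hypothetical choices made for illustration. The abstract does not state which demographic variables were used or how matching was performed.

```python
def linked_ids(cohort, hospital, keys):
    """Return ids of cohort members with a hospital record agreeing on all keys.

    Assumption: exact matching; records missing any key value are not linkable.
    """
    index = set()
    for rec in hospital:
        values = tuple(rec.get(k) for k in keys)
        if all(v is not None for v in values):
            index.add(values)
    linked = set()
    for person in cohort:
        values = tuple(person.get(k) for k in keys)
        if all(v is not None for v in values) and values in index:
            linked.add(person["id"])
    return linked


def disagreement_fraction(cohort, links_a, links_b):
    """Share of participants whose link status differs between two methods (assumed definition)."""
    return len(links_a ^ links_b) / len(cohort)


def sensitivity(links, confirmed_ids):
    """Share of participants with a confirmed admission that the linkage found."""
    return len(links & confirmed_ids) / len(confirmed_ids)


# Hypothetical toy data: four participants, three hospital records.
cohort = [
    {"id": 1, "medicare": "M1", "sex": "F", "dob": "1950-01-01", "postcode": "3000"},
    {"id": 2, "medicare": None, "sex": "M", "dob": "1945-05-05", "postcode": "3050"},
    {"id": 3, "medicare": "M3", "sex": "F", "dob": "1960-02-02", "postcode": "3121"},
    {"id": 4, "medicare": "M4", "sex": "M", "dob": "1955-03-03", "postcode": "3000"},
]
hospital = [
    {"medicare": "M1", "sex": "F", "dob": "1950-01-01", "postcode": "3000"},
    {"medicare": None, "sex": "M", "dob": "1945-05-05", "postcode": "3050"},
    # A different person who shares demographics with participant 4:
    # demographic-only linkage produces a false positive here.
    {"medicare": "M9", "sex": "M", "dob": "1955-03-03", "postcode": "3000"},
]

by_medicare = linked_ids(cohort, hospital, ("medicare", "sex"))
by_demo = linked_ids(cohort, hospital, ("sex", "dob", "postcode"))

print("Linked with Medicare number + sex:", sorted(by_medicare))
print("Linked with demographics only:", sorted(by_demo))
print("Disagreement fraction:", disagreement_fraction(cohort, by_medicare, by_demo))
print("Sensitivity (Medicare + sex):", sensitivity(by_medicare, {1, 2}))
```

In this toy example the Medicare-based linkage misses participant 2, whose number is unavailable, while the demographic-only linkage picks up a spurious link for participant 4. These are the two failure modes that combining several sets of identifying variables is meant to balance.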
Highlights
Cohort studies can provide valuable evidence of cause and effect relationships but are subject to loss of participants over time, limiting the validity of findings
The longstanding Framingham cohort study was critical in demonstrating the relationship between certain risk factors and the development of cardiovascular disease (CVD) events during follow-up [2,3,4]
While full names and addresses are not included in the dataset, approximately 81% of records in the Victorian Admitted Episodes Dataset (VAED) include Medicare card number, the national health insurance number allocated to all Australians
Summary
Cohort studies can provide valuable evidence of cause and effect relationships but are subject to loss of participants over time, limiting the validity of findings. The quality of evidence from cohort studies relies on complete and accurate ascertainment of outcomes such as myocardial infarction or stroke. Information about these and other health outcomes can be collected in a variety of ways, including medical record review and self-report from participants. While the former is considered the “gold standard” [5,6], it is resource intensive for large cohorts. Specific groups at risk of loss to follow-up include those of lower socioeconomic status and those with poorer health, often the groups of major interest to epidemiological research.