Objectives: To examine the validity of deterministic compared to probabilistic record linkage in the ascertainment of hospitalizations in two linked cohorts.

Study Design and Setting: HIV-negative (HIV-ve) (n = 1,325) and HIV-positive (HIV+ve) gay and bisexual men (n = 557) recruited in Sydney, Australia, were probabilistically and deterministically linked to a statewide hospital registry (July 2000–June 2012).

Results: Using probabilistic linkage as the reference standard, deterministic linkage had higher specificity but much lower sensitivity [34.67% (95% confidence interval: 33.44, 35.92)]. A disproportionate number of links missed were individuals with poorer socioeconomic and health indicators, including HIV status. Risk of hospitalization compared to the general male population [HIV+ve standardized incidence ratio (SIR) = 1.45 (1.33–1.59); HIV-ve SIR = 0.72 (0.67–0.78)] was significantly underestimated when deterministic linkage was used [HIV+ve SIR = 0.46 (0.37–0.58); HIV-ve SIR = 0.29 (0.24–0.35)]. The impact of linkage strategy on the calculation of incidence rate ratios (IRRs) was less, but a greater discrepancy in IRRs was seen for diagnostic categories where event rates were low or where the sensitivity of the deterministic linkage was differential between the two cohorts.

Conclusion: Linkage without proven high sensitivity and specificity should be carefully considered. In circumstances of undetermined sensitivity, SIRs should not be calculated as the extent of underestimation is unknown. The comparison of linked events within or between cohorts is more robust to linkage misclassification; however, selection bias does affect estimates and should be considered before linkage.
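The quantities reported above (sensitivity, specificity, and the SIR) can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names are hypothetical, the counts are made-up illustrative numbers rather than study data, and the Wilson interval is one common choice of confidence interval (the abstract does not state which method was used). The last function assumes that missed links are the only linkage error, in which case the observed SIR shrinks roughly in proportion to the sensitivity.

```python
from math import sqrt


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of reference-standard links that the tested linkage also found."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of reference-standard non-links that the tested linkage also rejected."""
    return true_neg / (true_neg + false_pos)


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (an assumed choice of CI method)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


def sir(observed: float, expected: float) -> float:
    """Standardized incidence ratio: observed events over expected events."""
    return observed / expected


def sir_with_missed_links(true_observed: float, expected: float, sens: float) -> float:
    """SIR when only a fraction `sens` of true events is linked.

    Assumes no false links, so missed links are the only source of error.
    """
    return sir(true_observed * sens, expected)


# Illustrative counts only (NOT taken from the study).
sens = sensitivity(true_pos=35, false_neg=65)
low, high = wilson_interval(successes=35, n=100)
print(f"sensitivity = {sens:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
print(f"SIR with complete linkage   = {sir(150, 100):.2f}")
print(f"SIR with incomplete linkage = {sir_with_missed_links(150, 100, sens):.2f}")
```

Under these assumptions, an unknown sensitivity leaves the size of the underestimation unknown as well, which is the reasoning behind the caution in the Conclusion.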