Objective: To evaluate alternative approaches to correcting for bias due to inaccurate diagnostic criteria in database studies of associations.

Study Design and Settings: A simulation study of a hypothetical cohort of 10,000 subjects selected based on database-derived diagnostic criteria with a positive predictive value (PPV) of either 53% or 80%. Analyses focus on the putative association between a drug and the time to a negative outcome. The association is confounded among “false positive” subjects, for whom the drug acts as a marker for unobserved frailty. First, we estimate a conventional multivariable Cox model (Model 1). We then assume that an in-depth evaluation of a fraction of subjects is available, which permits estimating the probability of having the disease for every subject in the cohort. Alternative correction methods use the estimated probability as a confounder (Model 2), a modifier of the drug effect (Model 3), or an importance weight (Model 4).

Results: With a PPV of 53%, Models 1 and 2 underestimated the drug effect by about 50%. The interaction-based Model 3 yielded the least biased estimates (25% bias), whereas weighting by probability (Model 4) yielded slightly more biased (33%) but more stable estimates.

Conclusion: The proposed methods help reduce bias due to sample contamination.
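To make the four model specifications concrete, the following is a minimal sketch of how the covariates and weight for a single subject might be set up under each model. It is an illustration under stated assumptions, not the authors' implementation: the names `Subject`, `design_row`, and `expected_true_cases` are hypothetical, other adjustment covariates are omitted, and the inclusion of a main effect for the probability in Model 3 is an assumption, since the abstract does not give the exact parameterization.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Subject:
    drug: int          # 1 if exposed to the drug, 0 otherwise
    p_disease: float   # estimated probability of truly having the disease


def expected_true_cases(cohort_size: int, ppv: float) -> int:
    """Expected number of true-positive subjects for a given PPV."""
    return round(cohort_size * ppv)


def design_row(s: Subject, model: int) -> Tuple[List[float], float]:
    """Return (covariates, weight) for one subject under Models 1-4.

    Other adjustment covariates are left out for brevity (assumption).
    """
    if model == 1:
        # Conventional model: ignores the disease probability.
        return [float(s.drug)], 1.0
    if model == 2:
        # Probability enters as an additional confounder.
        return [float(s.drug), s.p_disease], 1.0
    if model == 3:
        # Probability modifies the drug effect via an interaction term
        # (keeping the main effect of p_disease is an assumption).
        return [float(s.drug), s.p_disease, s.drug * s.p_disease], 1.0
    if model == 4:
        # Probability is used as an importance weight.
        return [float(s.drug)], s.p_disease
    raise ValueError("model must be 1, 2, 3, or 4")


if __name__ == "__main__":
    subject = Subject(drug=1, p_disease=0.5)
    for m in (1, 2, 3, 4):
        print(m, design_row(subject, m))
    print(expected_true_cases(10_000, 0.53), expected_true_cases(10_000, 0.80))
```

The resulting rows and weights would then be passed to whatever Cox regression routine is in use. With a PPV of 53%, roughly 4,700 of the 10,000 subjects are expected to be false positives, which is the contamination the weighting and interaction terms are meant to address.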