Abstract

Background: Health care data allow for the study and surveillance of chronic diseases such as diabetes. The objective of this study was to identify and validate optimal algorithms for diabetes cases within health care administrative databases for different research purposes, populations, and data sources.

Methods: We linked health care administrative databases from Ontario, Canada to a reference standard of primary care electronic medical records (EMRs). We then identified and calculated the performance characteristics of multiple adult diabetes case definitions, using combinations of data sources and time windows.

Results: The best algorithm to identify diabetes cases was the presence at any time of one hospitalization or physician claim for diabetes AND either one prescription for an anti-diabetic medication or one physician claim with a diabetes-specific fee code [sensitivity 84.2%, specificity 99.2%, positive predictive value (PPV) 92.5%]. Use of physician claims alone performed almost as well: three physician claims for diabetes within one year was highly specific (sensitivity 79.9%, specificity 99.1%, PPV 91.4%) and one physician claim at any time was highly sensitive (sensitivity 93.6%, specificity 91.9%, PPV 58.5%).

Conclusions: This study identifies validated algorithms to capture diabetes cases within health care administrative databases for a range of purposes, populations and data availability. These findings are useful to study trends and outcomes of diabetes using routinely-collected health care data.
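The performance characteristics reported in the abstract can be illustrated with a minimal sketch that derives sensitivity, specificity, and PPV from a 2x2 table of algorithm result versus reference standard. The class name, function names, and counts below are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 counts of a case definition versus the reference standard (EMR)."""
    true_pos: int   # flagged by the algorithm, diabetes in the EMR
    false_pos: int  # flagged by the algorithm, no diabetes in the EMR
    false_neg: int  # not flagged, diabetes in the EMR
    true_neg: int   # not flagged, no diabetes in the EMR


def sensitivity(c: ConfusionCounts) -> float:
    # Share of reference-standard cases that the algorithm captures.
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    # Share of reference-standard non-cases that the algorithm leaves unflagged.
    return c.true_neg / (c.true_neg + c.false_pos)


def ppv(c: ConfusionCounts) -> float:
    # Share of algorithm-flagged individuals who are reference-standard cases.
    return c.true_pos / (c.true_pos + c.false_pos)


# Hypothetical counts (assumption, not taken from the study).
counts = ConfusionCounts(true_pos=80, false_pos=10, false_neg=20, true_neg=890)
print(f"sensitivity={sensitivity(counts):.3f}")
print(f"specificity={specificity(counts):.3f}")
print(f"PPV={ppv(counts):.3f}")
```

All three measures come from the same table, which is why a definition can be highly specific yet still have a modest PPV when true cases are a small share of the population.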

Highlights

  • Health care data allow for the study and surveillance of chronic diseases such as diabetes

  • The Canadian Chronic Disease Surveillance System (CCDSS) uses routinely-collected provincial health care administrative records to identify diabetes cases, which are defined based on 1 hospitalization or 2 physician visit claims over a two-year period bearing a diagnostic code for diabetes [3]

  • The objectives of this study were to determine optimal algorithms to identify diabetes cases within health care administrative databases for different research purposes, populations, and data sources, using diabetes identified in primary care electronic medical records (EMRs) as the reference standard
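The CCDSS-style case definition in the highlights (1 hospitalization, or 2 physician visit claims within a two-year period) can be sketched as a simple rule over event dates. This is an illustrative sketch only: the function name, the input layout, and the approximation of two years as 730 days are assumptions, and the dates are assumed to be already restricted to records bearing a diabetes diagnostic code.

```python
from datetime import date, timedelta

# Assumption: a two-year window approximated as 730 days.
TWO_YEARS = timedelta(days=730)


def meets_ccdss_style_definition(hospitalization_dates, physician_claim_dates):
    """Return True if one hospitalization exists, or if any two physician
    claims fall within a two-year window of each other.

    Both arguments are lists of datetime.date values for records that
    already carry a diabetes diagnostic code (hypothetical input layout).
    """
    if hospitalization_dates:
        return True
    claims = sorted(physician_claim_dates)
    # If any two claims lie within the window, some adjacent pair in
    # sorted order does too, so checking adjacent pairs is sufficient.
    return any(later - earlier <= TWO_YEARS
               for earlier, later in zip(claims, claims[1:]))


# Hypothetical patient histories.
print(meets_ccdss_style_definition([date(2010, 1, 1)], []))
print(meets_ccdss_style_definition([], [date(2010, 1, 1), date(2011, 6, 1)]))
print(meets_ccdss_style_definition([], [date(2010, 1, 1), date(2013, 1, 1)]))
```

Varying the required number of claims and the window length in a rule like this is one way to generate the alternative case definitions that the study compares.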

Summary

Introduction

Health care data allow for the study and surveillance of chronic diseases such as diabetes. The Canadian Chronic Disease Surveillance System (CCDSS) uses routinely-collected provincial health care administrative records to identify diabetes cases, which are defined based on 1 hospitalization or 2 physician visit claims over a two-year period bearing a diagnostic code for diabetes [3]. [...] 97%, and a positive predictive value (PPV) of 80% [4]. This algorithm has been used extensively in diabetes research to report epidemiologic trends [5, 6], quantify risk factors [7, 8, 9, 10, 11], evaluate outcomes [12, 13, 14, 15], and identify health care gaps [16, 17, 18]. While the specificity of this definition is high, even modest compromises in positive predictive value have been shown to increase the risk of misclassification bias [19]. This may result in sizeable errors in estimates of disease prevalence in the context of relatively uncommon conditions and large sample sizes. The 2005 Ontario Diabetes Database was estimated to have a 3% ‘false positive’ rate and a 16% ‘false negative’ rate, meaning that as many as 249,840 individuals were mislabelled as having diabetes and 93,102 persons with diabetes were missed altogether [19].
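The point that a highly specific definition can still misclassify many people when a condition is relatively uncommon follows from the standard relationship between PPV, sensitivity, specificity, and prevalence. The sketch below applies that relationship; the function name and the input values are hypothetical and are not figures from the study.

```python
def ppv_from_prevalence(sens: float, spec: float, prevalence: float) -> float:
    """PPV implied by sensitivity, specificity, and true prevalence
    (Bayes' theorem applied to a binary case definition)."""
    true_pos = sens * prevalence                 # cases that are flagged
    false_pos = (1 - spec) * (1 - prevalence)    # non-cases that are flagged
    return true_pos / (true_pos + false_pos)


# Hypothetical values (assumption): the same definition applied to
# populations with different true prevalence.
for prevalence in (0.02, 0.05, 0.10, 0.20):
    value = ppv_from_prevalence(sens=0.85, spec=0.97, prevalence=prevalence)
    print(f"prevalence={prevalence:.2f} -> PPV={value:.3f}")
```

With sensitivity and specificity held fixed, PPV falls as prevalence falls, because false positives are drawn from the much larger group of people without the condition.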
