Abstract

There is a strong and continuously growing interest in using large electronic healthcare databases to study health outcomes and the effects of pharmaceutical products. However, concerns regarding disease misclassification (i.e. classification errors of the disease status) and its impact on the study results are legitimate. Validation is therefore increasingly recognized as an essential component of database research. In this work, we elucidate the interrelations between the true prevalence of a disease in a database population (i.e. prevalence assuming no disease misclassification), the observed prevalence subject to disease misclassification, and the most common validity indices: sensitivity, specificity, positive and negative predictive value. Based on this, we obtained analytical expressions to derive all the validity indices and true prevalence from the observed prevalence and any combination of two other parameters. The analytical expressions can be used for various purposes. Most notably, they can be used to obtain an estimate of the observed prevalence adjusted for outcome misclassification from any combination of two validity indices and to derive validity indices from each other which would otherwise be difficult to obtain. To allow researchers to easily use the analytical expressions, we additionally developed a user-friendly and freely available web-application.
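
The interrelations described above can be illustrated with the standard 2x2 definitions of the validity indices. The sketch below is not the authors' web-application or their exact expressions; the function names and the example values are hypothetical, and it assumes the textbook relation in which the observed prevalence is the sum of true positives and false positives.

```python
def observed_prevalence(true_prev, se, sp):
    """Observed prevalence: true positives plus false positives."""
    return se * true_prev + (1.0 - sp) * (1.0 - true_prev)


def true_prevalence(obs_prev, se, sp):
    """Invert the relation above to adjust an observed prevalence."""
    denom = se + sp - 1.0
    if denom == 0.0:
        # SE + SP = 1 means classification is uninformative.
        raise ValueError("SE + SP must differ from 1")
    return (obs_prev + sp - 1.0) / denom


def ppv(true_prev, se, sp):
    """Positive predictive value: true positives among observed positives."""
    return se * true_prev / observed_prevalence(true_prev, se, sp)


def npv(true_prev, se, sp):
    """Negative predictive value: true negatives among observed negatives."""
    obs_neg = 1.0 - observed_prevalence(true_prev, se, sp)
    return sp * (1.0 - true_prev) / obs_neg


# Hypothetical inputs, chosen only for illustration.
p_true, se, sp = 0.10, 0.90, 0.95
p_obs = observed_prevalence(p_true, se, sp)
print(p_obs, true_prevalence(p_obs, se, sp), ppv(p_true, se, sp), npv(p_true, se, sp))
```

With these hypothetical inputs the observed prevalence overstates the true prevalence, and inverting the relation recovers the true value up to floating-point rounding.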

Highlights

  • Epidemiology relies on accurately capturing the disease status of subjects within a certain population

  • Electronic healthcare record databases, which have become a prominent source of information in pharmacoepidemiology, are prone to disease misclassification

  • Starting from the observed prevalence, the PPV, and the true prevalence, the derived sensitivity (SE) was 88.5% (84.4–92.6), close to the study estimate of 89.3%
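
The kind of derivation mentioned in the last highlight can be sketched as follows. This is an illustration under the standard 2x2 definitions, not the authors' code: the function names are hypothetical, and the input values are made up rather than taken from the study. It uses the fact that the proportion of true positives equals both PPV times the observed prevalence and SE times the true prevalence.

```python
def sensitivity_from_ppv(obs_prev, ppv, true_prev):
    """SE = PPV * P_obs / P_true (true positives over truly diseased)."""
    if not 0.0 < true_prev < 1.0:
        raise ValueError("true prevalence must lie strictly between 0 and 1")
    return ppv * obs_prev / true_prev


def specificity_from_ppv(obs_prev, ppv, true_prev):
    """SP = 1 - (1 - PPV) * P_obs / (1 - P_true)."""
    if not 0.0 < true_prev < 1.0:
        raise ValueError("true prevalence must lie strictly between 0 and 1")
    false_pos = (1.0 - ppv) * obs_prev  # proportion of false positives
    return 1.0 - false_pos / (1.0 - true_prev)


# Hypothetical inputs for illustration only (not the study's data).
p_obs, ppv_value, p_true = 0.135, 2.0 / 3.0, 0.10
print(sensitivity_from_ppv(p_obs, ppv_value, p_true))  # approximately 0.90
print(specificity_from_ppv(p_obs, ppv_value, p_true))  # approximately 0.95
```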


Summary

Introduction

Epidemiology relies on accurately capturing the disease status of subjects within a certain population. Inaccuracies in obtaining the disease status might (strongly) bias the epidemiological findings. Electronic healthcare record (eHR) databases, which have become a prominent source of information in pharmacoepidemiology, are prone to disease misclassification. eHR databases capture healthcare provided to large populations; their size permits the study of rare events, and their establishment within clinical practices enables studying real-

Methods
Results
Conclusion
