How do endometriosis diagnoses and subtypes reported in administrative health data compare with surgically confirmed disease?

For endometriosis diagnosis, we observed substantial agreement and high sensitivity and specificity between administrative health data (International Classification of Diseases [ICD] 9 codes) and surgically confirmed diagnoses among participants who underwent gynecologic laparoscopy or laparotomy.

Several studies have assessed the validity of self-reported endometriosis in comparison to medical record reporting, finding strong confirmation. We previously reported high inter- and intra-surgeon agreement for endometriosis diagnosis in the Endometriosis, Natural History, Diagnosis, and Outcomes (ENDO) Study.

In this validation study, participants (n = 412) of the Utah operative cohort of the ENDO Study (2007-2009) were linked to medical records from the Utah Population Database (UPDB) to compare endometriosis diagnoses from each source. The UPDB is a unique database containing linked data on over 11 million individuals, including statewide ambulatory and inpatient records, state vital records, and University of Utah Health and Intermountain Healthcare electronic healthcare records, capturing most Utah residents.

The ENDO operative cohort consisted of individuals aged 18-44 years with no prior endometriosis diagnosis who underwent gynecologic laparoscopy or laparotomy for a variety of surgical indications. In total, 173 women were diagnosed with endometriosis based on surgical visualization of disease, 35% with superficial endometriosis, 9% with ovarian endometriomas, and 14% with deep infiltrating endometriosis. Contemporary administrative health data from the UPDB included ICD diagnostic codes from Utah Department of Health in-patient and ambulatory surgery records and University of Utah and Intermountain Health electronic health records.
For endometriosis diagnosis, we found relatively high sensitivity (0.88) and specificity (0.87) and substantial agreement (Kappa [Κ] = 0.74). We found similarly high sensitivity, specificity, and agreement for superficial endometriosis (n = 143; sensitivity 0.86, specificity 0.83, Κ = 0.65) and ovarian endometriomas (n = 38; sensitivity 0.82, specificity 0.92, Κ = 0.58). However, deep infiltrating endometriosis (n = 58) had much lower sensitivity (0.12) and agreement (Κ = 0.17), with high specificity (0.99).

Medication prescription data and unstructured data, such as clinical notes, were not included in the UPDB data used for this study. These additional data types could aid in detection of endometriosis. Most participants were white or Asian, and 11% reported Hispanic ethnicity, which may limit generalizability to some US states. Additionally, because the participants whose administrative health records we used were also part of the ENDO Study, the surgeons may have been more vigilant in diagnostic coding as a result of the operative forms they completed for the ENDO Study, which may have increased validity. However, the codes compared in the UPDB would have been entered by medical coders as part of standard clinical practice.

We observed substantial agreement between administrative health data and surgically confirmed endometriosis diagnoses overall, and for the superficial endometriosis and ovarian endometrioma subtypes. These findings may provide reassurance to researchers using administrative healthcare records to assess risk factors and long-term health outcomes of endometriosis. Our findings corroborate prior research that demonstrates high specificity but low sensitivity for deep infiltrating endometriosis, indicating that deep infiltrating endometriosis is not reliably annotated in administrative healthcare data. This suggests that medical record-based deep infiltrating endometriosis diagnoses may be suitable for etiologic studies but not for surveillance or detection studies.
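The sensitivity, specificity, and kappa values reported above all derive from a 2x2 cross-tabulation of administrative codes against surgical findings. As a minimal sketch of that arithmetic, the Python below uses hypothetical counts (not the study's data); the class and function names are illustrative assumptions, not the authors' analysis code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-tabulation of administrative codes against the surgical reference."""
    tp: int  # code present, disease surgically confirmed
    fp: int  # code present, no disease at surgery
    fn: int  # code absent, disease surgically confirmed
    tn: int  # code absent, no disease at surgery


def sensitivity(t: TwoByTwo) -> float:
    # Share of surgically confirmed cases that carry a diagnostic code.
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    # Share of surgically negative participants with no diagnostic code.
    return t.tn / (t.tn + t.fp)


def cohen_kappa(t: TwoByTwo) -> float:
    # Observed agreement corrected for the agreement expected by chance.
    n = t.tp + t.fp + t.fn + t.tn
    observed = (t.tp + t.tn) / n
    expected = (
        (t.tp + t.fp) * (t.tp + t.fn) + (t.fn + t.tn) * (t.fp + t.tn)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts chosen only to make the arithmetic easy to follow.
example = TwoByTwo(tp=40, fp=10, fn=10, tn=40)
print(sensitivity(example), specificity(example), cohen_kappa(example))
```

With these illustrative counts, observed agreement is 0.80 and chance-expected agreement is 0.50, so kappa is (0.80 - 0.50) / (1 - 0.50) = 0.60. Because kappa subtracts chance agreement, a code that is rarely assigned, as for deep infiltrating endometriosis, can show very high specificity while still yielding low kappa.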
The original ENDO Study was funded by the Intramural Research Program, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health (contracts NO1-DK-6-3428; NO1-DK-6-3427; 10001406-02). We acknowledge partial support for the UPDB through grant P30 CA2014 from the National Cancer Institute, University of Utah and from the University of Utah's program in Personalized Health and Center for Clinical and Translational Science. This research was also supported by the NCRR grant, 'Sharing Statewide Health Data for Genetic Research' (R01 RR021746, G. Mineau, PI) with additional support from the Utah Department of Health and Human Services, University of Utah. Additionally, this research was supported by the Utah Cancer Registry, which is funded by the National Cancer Institute's SEER Program, Contract No. HHSN261201800016I, the US Centers for Disease Control and Prevention's National Program of Cancer Registries, Cooperative Agreement No. NU58DP007131, with additional support from the University of Utah and Huntsman Cancer Foundation. Research reported in this publication was also supported by the National Institutes of Health (Award Numbers R01HL164715 [to L.V.F., K.C.S., and A.Z.P.] and K01AG058781 [to K.C.S.]), by the Huntsman Cancer Institute's Breast and Gynecologic Cancers Center, and by the Doris Duke Foundation's COVID-19 Fund to Retain Clinical Scientists funded by the American Heart Association. A.C.K. was supported by Training Grant Number 5T15LM007124 from the National Library of Medicine to K.E. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or other sponsors. There are no competing interests among any of the authors. N/A.