Abstract

Objective

Cohort selection is challenging for large-scale electronic health record (EHR) analyses, as International Classification of Diseases 9th edition (ICD-9) diagnostic codes are notoriously unreliable disease predictors. Our objective was to develop, evaluate, and validate an automated algorithm for determining an Autism Spectrum Disorder (ASD) patient cohort from EHR data. We demonstrate its utility via the largest investigation to date of the co-occurrence patterns of medical comorbidities in ASD.

Methods

We extracted ICD-9 codes and concepts derived from the clinical notes. A gold standard patient set was labeled by clinicians at Boston Children’s Hospital (BCH) (N = 150) and Cincinnati Children’s Hospital Medical Center (CCHMC) (N = 152). Two algorithms were created: (1) a rule-based algorithm implementing the ASD criteria from the Diagnostic and Statistical Manual of Mental Disorders, 4th edition, and (2) a predictive classifier. The positive predictive values (PPV) achieved by these algorithms were compared to an ICD-9 code baseline. We clustered the patients based on grouped ICD-9 codes and evaluated subgroups.

Results

The rule-based algorithm produced the best PPV: (a) BCH: 0.885 vs. 0.273 (baseline); (b) CCHMC: 0.840 vs. 0.645 (baseline); (c) combined: 0.864 vs. 0.460 (baseline). A validation at Children’s Hospital of Philadelphia yielded a PPV of 0.848. Clustering analyses of comorbidities on the three-site large cohort (N = 20,658 ASD patients) identified psychiatric, developmental, and seizure disorder clusters.

Conclusions

In a large cross-institutional cohort, co-occurrence patterns of comorbidities in ASD provide further hypothetical evidence for distinct courses in ASD. The proposed automated algorithms for cohort selection open avenues for other large-scale EHR studies and individualized treatment of ASD.
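The PPV comparison reported above can be illustrated with a minimal sketch. It assumes that each method yields a yes/no ASD flag per patient, that the clinician-labeled gold standard supplies the true label, and that the ICD-9 baseline flags every patient carrying a qualifying code. The function name and the toy labels are hypothetical and are not study data.

```python
from typing import Iterable


def positive_predictive_value(predicted: Iterable[bool], gold: Iterable[bool]) -> float:
    """Return PPV = TP / (TP + FP) over the patients a method flagged as ASD."""
    tp = 0  # flagged by the method and confirmed by the gold standard
    fp = 0  # flagged by the method but not confirmed
    for flagged, truth in zip(predicted, gold):
        if flagged:
            if truth:
                tp += 1
            else:
                fp += 1
    if tp + fp == 0:
        raise ValueError("no predicted positives; PPV is undefined")
    return tp / (tp + fp)


# Hypothetical toy labels for six patients (illustration only).
gold = [True, False, True, True, False, False]

# Assumption: the ICD-9 baseline flags every patient in the set.
baseline = [True] * len(gold)

# Assumption: a stricter method flags only a subset of patients.
rule_based = [True, False, True, False, False, True]

baseline_ppv = positive_predictive_value(baseline, gold)
rule_based_ppv = positive_predictive_value(rule_based, gold)
print(f"baseline PPV:   {baseline_ppv:.3f}")
print(f"rule-based PPV: {rule_based_ppv:.3f}")
```

PPV only scores the patients a method flags, so it says nothing about true cases the method misses; that is why it suits cohort selection, where the priority is that selected patients truly have the condition.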

Highlights

  • With the prevalence of Autism Spectrum Disorders (ASD) at 1 in 68 children under the age of 8 years,[1] understanding the distinct clinical courses among patients with ASD is of great clinical relevance

  • The initial cohort consisted of all patients with International Classification of Diseases 9th edition (ICD-9) diagnosis codes of 299.0, 299.80, 299.9 (Autism, Asperger’s, Pervasive Developmental Disorder Not Otherwise Specified (PDD-NOS), respectively) from the Boston Children’s Hospital (BCH) and Cincinnati Children’s Hospital Medical Center (CCHMC) electronic health record (EHR) databases (14,758 and 4,229 patients, respectively) (Fig 1, ICD-9 Inclusion)

  • One limitation of the current data set is that the area under the receiver operating characteristic curve (AUC) on the test set did not exceed 0.55 for either Cincinnati Children’s Hospital Medical Center (CCHMC) or BCH in the Stage 1 experiments, possibly indicating that the development set from a single site is not large enough to provide a representative sample on which to examine a model
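The ICD-9 inclusion step described in the second highlight can be sketched as a simple filter. This is an illustration only: it assumes diagnoses are available as (patient ID, ICD-9 code) pairs and that codes are compared by exact string match, whereas the study's actual extraction logic is not given here. The function name and the sample records are hypothetical.

```python
from typing import Iterable, Set, Tuple

# ICD-9 codes listed in the highlight: Autism, Asperger's, PDD-NOS.
ASD_ICD9_CODES = {"299.0", "299.80", "299.9"}


def icd9_inclusion(diagnoses: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return IDs of patients with at least one ASD ICD-9 code.

    Assumption: exact string match after trimming whitespace; a real
    pipeline might instead normalize codes or match by prefix.
    """
    return {pid for pid, code in diagnoses if code.strip() in ASD_ICD9_CODES}


# Hypothetical diagnosis records (illustration only).
records = [
    ("p1", "299.0"),
    ("p1", "345.9"),
    ("p2", "314.01"),
    ("p3", "299.80"),
    ("p4", " 299.9 "),
]

initial_cohort = icd9_inclusion(records)
print(sorted(initial_cohort))
```

A cohort built this way is only a starting point: the baseline PPVs reported in the abstract show that many patients carrying these codes are not confirmed as ASD cases on clinician review.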


Funding

National Human Genome Research Institute (NHGRI), through the following grants: U01HG006828 (Cincinnati Children’s Hospital Medical Center/Boston Children’s Hospital); U01HG006830 (Children’s Hospital of Philadelphia); U01HG006378 (Vanderbilt University Medical Center); and U01HG008666 (Cincinnati Children’s Hospital Medical Center). This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials.
