Abstract

Coronary artery disease (CAD) is a critical risk factor and toxicity endpoint in thoracic radiation oncology. The recent advent of artificial intelligence (AI) algorithms, specifically natural language processing (NLP), allows these data to be extracted from electronic health records (EHR) automatically, more efficiently, and at scale. We report on the application and performance characteristics of an EHR-based automated phenotyping tool in a cohort of lung cancer patients receiving radiation therapy (RT). For this radiation oncology use case, we applied an AI-based disease phenotyping tool that other researchers had developed for phenotyping CAD from the EHR. The CAD model was trained on 191,187 patients with at least one ICD code and validated on 158 gold-standard patients (sensitivity: 73%, specificity: 93%, PPV: 90%). NLP disease definitions were determined using the Unified Medical Language System (UMLS) and were combined with ICD codes through the PheNorm classification method to create the phenotype algorithm (Yue et al., J Am Med Inform Assoc. 2018 Jan 1; 25(1):54-60). We applied the CAD phenotyping tool to two independent cohorts of RT-treated lung cancer patients: 1) an annotated set of 677 patients who received RT from 1998 to 2014 and were studied in prior cardiac toxicity research; 2) an independent set of 209 patients who received RT from 1998 to 2018 and were annotated separately. Only 7.7% of the AI training population had a CAD diagnosis code, compared with 32.5% and 65.0% in the 677-patient and 209-patient cohorts, respectively. Validation of the AI-predicted CAD is shown in the table below. Performance on the 209-patient cohort best matched that on the AI training set. The 677-patient cohort differed from the 209-patient cohort in that its CAD definition included patients with radiographic coronary artery calcifications, which accounted for the differences in sensitivity, F-score, and NPV between the cohorts.
The AI-derived CAD phenotype was moderately accurate in identifying patients with CAD and highly accurate in confirming its absence. The differences in sensitivity and NPV between cohorts reflect differences in how CAD was defined; nevertheless, this study demonstrates the potential of automated AI phenotyping to detect pre-treatment risk factors and cardiac toxicity events from the EHR.
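
As an illustration of the metrics reported above, the following minimal Python sketch computes sensitivity, specificity, PPV, NPV, and an F-score from a confusion matrix, and shows how broadening the gold-standard CAD definition can lower sensitivity, NPV, and F-score without changing the model's predictions. All counts are hypothetical and are not data from this study; the F-score is assumed here to be F1, which the abstract does not specify, and the names `Confusion` and `metrics` are introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of model predictions against a gold-standard annotation."""
    tp: int  # predicted CAD, annotated CAD
    fp: int  # predicted CAD, annotated no CAD
    fn: int  # predicted no CAD, annotated CAD
    tn: int  # predicted no CAD, annotated no CAD


def metrics(c: Confusion) -> dict:
    """Return the standard binary classification metrics as fractions."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    # F1 (an assumption): harmonic mean of PPV (precision) and sensitivity (recall).
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical cohort annotated with a narrower CAD definition.
narrow = metrics(Confusion(tp=60, fp=5, fn=15, tn=120))

# Same model predictions, but 30 hypothetical patients with calcifications
# only are now annotated as CAD. The model predicted "no CAD" for them, so
# they move from true negatives to false negatives.
broad = metrics(Confusion(tp=60, fp=5, fn=45, tn=90))

for name in ("sensitivity", "specificity", "ppv", "npv", "f1"):
    print(f"{name:12s} narrow={narrow[name]:.3f} broad={broad[name]:.3f}")
```

In this sketch the PPV is identical under both definitions, while sensitivity, NPV, and F1 fall under the broader one, which is the same direction of change the abstract attributes to the difference in CAD definitions.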
