Convolutional Neural Networks Naively Trained on Radiation Oncology-Specific Data Outperforms Classical Natural Language Processing Approaches for Automated Identification of Common Toxicity Terms

M.R. Waters, K.H. Kang, R.J. Brenneman, D. Caruthers, M.B. Spraker, C.D. Abraham

doi:10.1016/j.ijrobp.2021.07.164

Abstract

Electronic medical records (EMR) hold potential for transformative improvements in the quality and efficiency of healthcare delivery. However, the unstructured nature of EMR data often necessitates manual review and is a significant barrier to leveraging it for downstream analysis. Methods to address this challenge include natural language processing (NLP) techniques such as tokenization, shallow parsing, and boundary/negation detection. Recent radiation oncology (RO)-specific attempts to use NLP to structure EMR data have involved identifying common toxicity terms in on-treatment visit (OTV) notes. Classical NLP works well for positive toxicity term identification (i.e., toxicity present) but is suboptimal for negated symptoms (i.e., toxicity absent). Convolutional neural networks (CNNs) aid context detection; however, no publicly available RO-specific CNNs that identify toxicity terms exist. We hypothesized that RO-specific CNNs, naively trained on OTV notes, could improve tabulation of positive and negative toxicity information to an accuracy level suitable for automation.

OTV notes (n = 3789) for prostate cancer (PCa) patients treated at our institution between 2019 and 2021 were identified and analyzed for the positive mention, negation, or omission of Common Terminology Criteria for Adverse Events (CTCAE) toxicity terms. The top 5 terms identified (fatigue, nausea, diarrhea, dysuria, hematuria) were used for further analysis. Each note was manually classified by explicit positive identification, negation, or omission of each toxicity term, and these labels were used to train in-house, toxicity-term-specific CNNs. The algorithms were structured as a 3-class classification problem distinguishing positive identification, negation, and omission of the term. Manual review of each OTV note for the presence, absence, or omission of toxicity terms served as the gold standard for accuracy measurements. Overall out-of-sample accuracy and F1 score were determined using a train/test split of OTV notes.

The 3-class (positive/negated/omitted) accuracy and F1 score of the CNNs for the top 5 CTCAE terms in PCa OTV notes were: fatigue (accuracy = 0.93, F1 = 0.95), diarrhea (accuracy = 0.95, F1 = 0.94), nausea (accuracy = 0.98, F1 = 1.0), dysuria (accuracy = 0.97, F1 = 0.97), and hematuria (accuracy = 0.99, F1 = 0.96).

Training naïve CNNs with RO-specific training data from OTV notes increased the accuracy of CTCAE toxicity coding. This approach addresses challenges previously encountered when applying classical NLP to RO EMR data. Therefore, use of CNNs in NLP may reduce barriers to implementing automated data extraction for retrospective and prospective analyses.
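
As an illustration of the evaluation described above, the following minimal Python sketch computes accuracy and F1 for a 3-class (positive/negated/omitted) labelling of notes against manual review. The abstract does not state how F1 was averaged across the three classes, so macro averaging is an assumption here. The function names and the toy labels are likewise hypothetical and are not study data or the authors' code.

```python
from collections import Counter

# The three classes named in the abstract for each toxicity term.
LABELS = ("positive", "negated", "omitted")


def accuracy(y_true, y_pred):
    """Fraction of notes whose predicted class matches the manual label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (assumed averaging; not stated in the source)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class with no true or predicted instances contributes 0.0 here.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for one toxicity term (illustrative only, not study data).
manual = ["positive", "negated", "omitted", "negated", "omitted", "positive"]
model = ["positive", "negated", "omitted", "positive", "omitted", "positive"]

acc = accuracy(manual, model)
f1 = macro_f1(manual, model)
print(f"accuracy={acc:.2f} macro_F1={f1:.2f}")
```

In this toy case the single error is a negated mention predicted as positive, the failure mode the abstract attributes to classical NLP. It lowers F1 for both the negated and the positive class, which is why a per-class metric is informative alongside overall accuracy.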
