Natural language processing of head CT reports to identify intracranial mass effect: CTIME algorithm.

Alexandra June Gordon, Tsuyoshi Mitarai, Jason Block, Imon Banerjee, Daniel L Rubin, Jennifer G Wilson, Christopher Winstead-Derlega, Michael Jarrett, Michael A Kohn, Josh Sanyal, Max Wintermark

doi:10.1016/j.ajem.2021.11.001

Abstract

The Mortality Probability Model (MPM) is used in research and quality improvement to adjust for severity of illness and can also inform triage decisions. However, a barrier to its automated use is that it includes the variable "intracranial mass effect" (IME), which requires manual review of the electronic health record (EHR). We developed and tested a natural language processing (NLP) algorithm to identify IME from head CT reports. We obtained initial head CT reports from adult patients who were admitted to the ICU from our ED between October 2013 and September 2016. Each head CT report was labeled yes/no for IME by at least two of five independent labelers. The reports were then randomly divided 80/20 into training and test sets. All reports were preprocessed to remove linguistic and style variability, and a dictionary was created to map similar common terms. We tested three vectorization strategies for converting the report text to a numerical vector: Term Frequency-Inverse Document Frequency (TF-IDF), Word2Vec, and Universal Sentence Encoder. This vector served as the input to a classification-tree-based ensemble machine learning algorithm (XGBoost). After training, model performance was assessed in the test set using the area under the receiver operating characteristic curve (AUROC). We also divided the continuous range of scores into positive/inconclusive/negative categories for IME. Of the 1202 CT reports in the training set, 308 (25.6%) were manually labeled as "yes" for IME. Of the 355 reports in the test set, 108 (30.4%) were labeled as "yes" for IME. The TF-IDF vectorization strategy as an input for the XGBoost model had the best AUROC: 0.9625 (95% CI 0.9443-0.9807). TF-IDF score categories were defined and had the following likelihood ratios: "positive" (TF-IDF score > 0.5) LR = 24.59; "inconclusive" (TF-IDF score 0.05-0.5) LR = 0.99; and "negative" (TF-IDF score < 0.05) LR = 0.05. Overall, 82% of reports were classified as either "positive" or "negative".
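To make the TF-IDF vectorization step concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The tokenizer, the smoothed IDF variant, and the three report snippets are all assumptions invented for illustration; the abstract does not specify the preprocessing rules or the exact TF-IDF formula used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a stand-in for the real preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(reports):
    """Return (vocabulary, TF-IDF vectors) for a list of report strings."""
    docs = [tokenize(report) for report in reports]
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: how many reports contain each term at least once.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty reports
        # Term frequency (relative count) times IDF, one entry per vocabulary term.
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


# Hypothetical report snippets, invented for illustration only.
reports = [
    "No midline shift. No mass effect.",
    "Subdural hematoma with midline shift and effacement of the sulci.",
    "No acute intracranial abnormality.",
]
vocab, vectors = tfidf_vectors(reports)
```

Each report becomes a fixed-length numeric vector with one weight per vocabulary term, which is the kind of input a tree-based classifier such as XGBoost can consume. Terms that appear in few reports (here "hematoma") receive a higher IDF than terms that appear in many (here "no").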
In the test set, only 4 of 199 (2.0%) reports with a "negative" classification were false negatives and only 8 of 93 (8.6%) reports classified as "positive" were false positives. NLP can accurately identify IME from free-text reports of head CTs in approximately 80% of records, adequate to allow automatic calculation of MPM based on EHR data for many applications.
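The score categories and likelihood ratios can be checked with a short worked example. This is a sketch under stated assumptions: the function names are hypothetical, the boundary handling at exactly 0.05 and 0.5 is assumed from the abstract's wording, and the per-category counts are derived by subtraction from the totals reported above (355 test reports, 108 with IME, 93 "positive" with 8 false positives, 199 "negative" with 4 false negatives).

```python
def categorize(score, low=0.05, high=0.5):
    """Map a continuous model score to one of the three reported categories."""
    if score > high:
        return "positive"
    if score < low:
        return "negative"
    return "inconclusive"  # 0.05 to 0.5, boundaries assumed inclusive


def likelihood_ratio(with_ime, without_ime, total_with, total_without):
    """LR = P(category | IME) / P(category | no IME)."""
    return (with_ime / total_with) / (without_ime / total_without)


TOTAL_REPORTS, TOTAL_IME = 355, 108
TOTAL_NO_IME = TOTAL_REPORTS - TOTAL_IME  # 247

# Counts per category, derived from the abstract by subtraction.
pos_ime, pos_no_ime = 93 - 8, 8    # 85 true positives, 8 false positives
neg_ime, neg_no_ime = 4, 199 - 4   # 4 false negatives, 195 true negatives
inc_ime = TOTAL_IME - pos_ime - neg_ime              # 19
inc_no_ime = TOTAL_NO_IME - pos_no_ime - neg_no_ime  # 44

lr_positive = likelihood_ratio(pos_ime, pos_no_ime, TOTAL_IME, TOTAL_NO_IME)
lr_inconclusive = likelihood_ratio(inc_ime, inc_no_ime, TOTAL_IME, TOTAL_NO_IME)
lr_negative = likelihood_ratio(neg_ime, neg_no_ime, TOTAL_IME, TOTAL_NO_IME)
conclusive_fraction = (93 + 199) / TOTAL_REPORTS

print(f"LR positive:     {lr_positive:.2f}")
print(f"LR inconclusive: {lr_inconclusive:.2f}")
print(f"LR negative:     {lr_negative:.2f}")
print(f"Conclusive:      {conclusive_fraction:.1%}")
```

With these derived counts, the inconclusive and negative ratios round to the reported 0.99 and 0.05, and the conclusive fraction matches the reported 82%. The positive ratio comes out near 24.3 rather than the reported 24.59, which suggests the published figure was computed from slightly different counts than those recoverable from the abstract.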


Journal: The American Journal of Emergency Medicine	Publication Date: Jan 1, 2022
Citations: 7

Similar Papers

Deep learning-based classification and mutation prediction from histopathological images of hepatocellular carcinoma.
Haotian Liao ... Ruijiang Han
Clinical and Translational Medicine | VOL. 10
01 Jun 2020

Random forest estimation of genomic breeding values for disease susceptibility over different disease incidences and genomic architectures in simulated cow calibration groups
S Naderi ... S König
Journal of Dairy Science | VOL. 99
22 Jun 2016

Developing and validating natural language processing algorithms for radiology reports compared to ICD-10 codes for identifying venous thromboembolism in hospitalized medical patients
Amol A Verma ... Fahad Razak
Thrombosis Research | VOL. 209
27 Nov 2021

Prediction of Microsatellite Instability in Colorectal Cancer Using a Machine Learning Model Based on PET/CT Radiomics.
Soyoung Kim ... Jae-Hoon Lee
Yonsei Medical Journal | VOL. 64
01 Jan 2023
