Abstract

Objective: Radiology is a finite health care resource in high demand at most health centers. However, anticipating fluctuations in demand is a challenge because of the inherent uncertainty in disease prognosis. The aim of this study was to explore the potential of natural language processing (NLP) to predict downstream radiology resource utilization in patients undergoing surveillance for hepatocellular carcinoma (HCC).

Materials and Methods: All HCC surveillance CT examinations performed at our institution from January 1, 2010, to October 31, 2017, were selected from our departmental radiology information system. We used open source NLP and machine learning software to parse radiology report text into bag-of-words and term frequency–inverse document frequency (TF-IDF) representations. Three machine learning models, namely logistic regression, support vector machine (SVM), and random forest, were used to predict future utilization of radiology department resources. A test data set was used to calculate accuracy, sensitivity, specificity, and the area under the curve (AUC).

Results: As a group, the bag-of-words models performed slightly worse than the TF-IDF models. The TF-IDF + SVM model outperformed all other models, with an accuracy of 92%, a sensitivity of 83%, a specificity of 96%, and an AUC of 0.971.

Conclusions: NLP-based models can accurately predict downstream radiology resource utilization from narrative HCC surveillance reports and have potential for translation to health care management, where they may improve decision making, reduce costs, and broaden access to care.
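
For readers unfamiliar with the terms used above, the following is a minimal Python sketch of two of the building blocks: a TF-IDF weighting of report text, and accuracy, sensitivity, and specificity computed from binary predictions. It is not the authors' implementation; the abstract does not name the software used. The example reports, the labels, the function names `tfidf` and `binary_metrics`, and the particular TF-IDF variant (relative term frequency times log(N / document frequency)) are all assumptions made for illustration. The classifier itself and the AUC calculation are omitted.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)  # the bag-of-words counts for this document
        total = len(tokens)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical toy reports, invented for illustration only.
reports = [
    "no focal lesion stable cirrhosis",
    "new arterially enhancing lesion recommend mri",
    "stable cirrhosis no lesion",
]
weights = tfidf(reports)

# Hypothetical labels: 1 = follow-up imaging was used, 0 = it was not.
metrics = binary_metrics(y_true=[1, 1, 0, 0, 0], y_pred=[1, 0, 0, 0, 0])
```

In this toy corpus, a term that appears in every report (such as "lesion") receives a weight of zero, while a term confined to one report (such as "mri") receives a positive weight. That down-weighting of ubiquitous terms is what distinguishes TF-IDF from raw bag-of-words counts.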

