Abstract

Many predictive models for estimating clinical outcomes after spine surgery have been reported in the literature. However, implementation of predictive scores in practice is limited by the time-intensive nature of manually abstracting relevant predictors. In this study, we designed natural language processing (NLP) algorithms to automate data abstraction for the Thoracolumbar Injury Classification and Severity Score (TLICS). We retrieved the radiology reports of all Mayo Clinic patients with an International Classification of Diseases, 9th or 10th revision, code corresponding to a fracture of the thoracolumbar spine between January 2005 and October 2020. Annotated reports were used to train N-gram NLP models using machine learning methods, including random forest, stepwise linear discriminant analysis, k-nearest neighbors, and penalized logistic regression. A total of 1085 spine radiology reports were included in our analysis. Our dataset included 483 compression, 401 burst, 103 translational/rotational, and 98 distraction fractures. A total of 103 reports documented an injury of the posterior ligamentous complex. The overall accuracy of the random forest model for fracture morphology feature detection was 76.96%, versus 65.90% for stepwise linear discriminant analysis, 50.69% for k-nearest neighbors, and 62.67% for penalized logistic regression. The overall accuracy for detecting posterior ligamentous complex integrity was highest with the random forest model, at 83.41%. Our random forest model was implemented in the backend of a web application in which users can dictate reports and have TLICS features automatically extracted. We have developed a machine learning NLP model for extracting TLICS features from radiology reports, which we deployed in a web application that can be integrated into clinical practice.
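
To illustrate what an N-gram representation of a radiology report might look like before it is passed to a classifier such as a random forest, the following minimal sketch builds unigram and bigram count features using only the Python standard library. The abstract does not specify the tokenization, the N-gram order, or the software used, so the function names (`tokenize`, `ngrams`, `ngram_counts`, `vectorize`), the choice of `max_n=2`, and the two example report sentences are all assumptions made for illustration, not the authors' pipeline or data.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed scheme)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, max_n=2):
    """Count every N-gram of order 1..max_n in one report."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


def vectorize(reports, max_n=2):
    """Build a shared sorted vocabulary and one count vector per report."""
    per_report = [ngram_counts(r, max_n) for r in reports]
    vocab = sorted(set().union(*per_report))
    # Counter returns 0 for N-grams that a report does not contain.
    matrix = [[counts[term] for term in vocab] for counts in per_report]
    return vocab, matrix


# Hypothetical report snippets, invented for this example only.
reports = [
    "Burst fracture of L1 with retropulsion.",
    "Mild compression fracture of T12.",
]
vocab, X = vectorize(reports)
```

Each row of `X` is a count vector over the shared vocabulary; in a supervised setup like the one described, such vectors would be paired with the annotated fracture morphology and posterior ligamentous complex labels to train and compare classifiers.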
