Abstract
BACKGROUND CONTEXT Medical notes are a rich source of clinical data, but their unstructured text format prevents computers from readily mining that data. Applying natural language processing (NLP) in combination with machine learning to standard operative notes may allow for efficient billing, maximization of collections and minimization of coder error.
PURPOSE We hypothesize that a machine learning algorithm can accurately identify billing CPT codes from unstructured patient operative notes.
STUDY DESIGN/SETTING This was a retrospective analysis of medical notes from the database of a large, single-center academic institution, comprising cases from a single surgeon.
PATIENT SAMPLE Patients were included if they underwent elective spine surgery by a single senior surgeon between 9/2015 and 1/2020.
OUTCOME MEASURES Algorithm performance was measured by performing receiver-operating characteristic (ROC) analysis and calculating the area under the ROC curve (AUC).
METHODS The data were randomly split, with 70% used for training and 30% used for testing. Labels (CPT codes) were generated by the billing and coding department. NLP techniques were used to analyze standard operative notes and train an algorithm to automatically generate CPT codes. A deep learning NLP algorithm (bidirectional long short-term memory network with attention) was tested on operative notes to predict CPT codes. CPT codes generated by the billing department were compared to those generated by our model.
RESULTS A total of 391 operative dictations fit our inclusion criteria. The NLP algorithm identified the correct CPT codes on the validation set with a final accuracy of 98%. The overall AUC was 80% for the top 36 CPT codes. The average class-by-class accuracy for the top 36 CPT codes was 80%.
CONCLUSIONS Combining NLP with machine learning is a valid approach for automatic generation of CPT billing codes, and it can be used by departments to allow for efficient billing, maximization of collections and minimization of coder error.
FDA DEVICE/DRUG STATUS This abstract does not discuss or include any applicable devices or drugs.
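The evaluation protocol described in the abstract (a randomized 70/30 split and per-code ROC analysis) can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation: the function names `split_70_30`, `roc_auc` and `macro_auc`, the fixed random seed, the rounding rule for the split, and the unweighted averaging across codes are all assumptions made for illustration. The AUC is computed as the probability that a randomly chosen positive note scores higher than a randomly chosen negative one, which is equivalent to the area under the ROC curve.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_70_30(items: Sequence, seed: int = 0) -> Tuple[list, list]:
    """Shuffle the items and return (train, test) in a 70/30 split.

    The seed and the rounding rule are illustrative assumptions.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC for one CPT code treated as a binary label (1 = code billed).

    Computed as the fraction of (positive, negative) pairs in which the
    positive note receives the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative note")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_auc(
    y_true: Dict[str, List[int]], y_score: Dict[str, List[float]]
) -> float:
    """Unweighted mean of the per-code AUCs (an assumed averaging scheme)."""
    aucs = [roc_auc(y_true[code], y_score[code]) for code in y_true]
    return sum(aucs) / len(aucs)
```

In use, each CPT code would get one list of binary labels from the billing department and one list of model scores over the held-out notes, and `macro_auc` would summarize them in a single number. With the rounding rule assumed here, 391 notes would split into 274 training and 117 test notes; the abstract does not report the exact counts.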