Objective The injury severity classification based on the Abbreviated Injury Scale (AIS) provides information that allows for standardized comparisons in injury research. However, the majority of injury data are captured using the International Classification of Diseases (ICD), which lacks injury severity information. An encoder-decoder-based neural machine translation (NMT) model has been shown to be more accurate than other methods for determining injury severity from ICD codes. The objectives of this project were to determine whether feed-forward neural networks (FFNN) perform as well as NMT and whether direct estimation of injury severity is more accurate than using AIS codes as an intermediary (indirect method).
Methods Patient data from the National Trauma Data Bank (NTDB) were used to develop and test four models (NMT/Indirect, NMT/Direct, FFNN/Indirect, FFNN/Direct). A total of 2,031,793 cases from 2017–2018 were used for training and 1,091,792 cases from 2019 were used for testing. The primary outcome was the percentage of cases with the correct binary classification of Injury Severity Score (ISS) ≥16, using ISS values recorded in the NTDB as the benchmark. The secondary outcome was the percentage of predicted ISS values exactly matching the recorded ISS.
Results Indirect estimation through first converting to AIS using NMT was the most accurate in predicting ISS ≥16 (94.0%), followed by direct estimation with FFNN (93.4%), direct estimation with NMT (93.1%), and then indirect estimation with FFNN (93.1%), with statistically significant differences in pairwise comparisons. The rankings were the same when models were evaluated on exact matches of ISS. Training times were similar for all models (range 11–14 h), but testing was much faster for the FFNN models (GPU: 1–2 min) than for the NMT models (GPU: 69–82 min).
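The two outcome measures described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function name `score_predictions`, its arguments, and the toy ISS values are hypothetical and chosen only to show how the percentage of correct ISS ≥16 classifications and the percentage of exact ISS matches might be computed from paired recorded and predicted scores.

```python
from typing import Sequence, Tuple


def score_predictions(
    recorded: Sequence[int],
    predicted: Sequence[int],
    threshold: int = 16,
) -> Tuple[float, float]:
    """Return (binary_pct, exact_pct) for paired ISS values.

    binary_pct: percent of cases where predicted and recorded ISS fall on
                the same side of the threshold (ISS >= 16 by default).
    exact_pct:  percent of cases where predicted ISS equals recorded ISS.
    """
    if len(recorded) != len(predicted):
        raise ValueError("recorded and predicted must have the same length")
    if not recorded:
        raise ValueError("at least one case is required")

    n = len(recorded)
    # Primary outcome: agreement on the severe/non-severe classification.
    binary_hits = sum(
        (r >= threshold) == (p >= threshold)
        for r, p in zip(recorded, predicted)
    )
    # Secondary outcome: predicted ISS exactly equals recorded ISS.
    exact_hits = sum(r == p for r, p in zip(recorded, predicted))
    return 100.0 * binary_hits / n, 100.0 * exact_hits / n


# Toy example with made-up values (not NTDB data).
recorded_iss = [9, 16, 25, 4]
predicted_iss = [9, 17, 14, 4]
binary_pct, exact_pct = score_predictions(recorded_iss, predicted_iss)
print(f"Correct ISS>=16 classification: {binary_pct:.1f}%")
print(f"Exact ISS match: {exact_pct:.1f}%")
```

In this toy example a prediction of 17 for a recorded ISS of 16 counts as correct for the primary outcome but not for the secondary one, which is why exact-match percentages are always at or below the binary-classification percentages.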
Conclusions The most accurate method for obtaining injury severity from ICD was NMT using AIS codes as an intermediary (indirect method), although all methods performed well. The indirect NMT model was the most resource intensive in terms of processing time. The optimal approach for researchers will be based on their needs and the computing resources available.