Abstract

In injury surveillance, different aspects of an injury event are captured using injury codes such as the External-cause-of-injury code (E-code), Major Injury Factor (MIF), and Intent. These codes are usually assigned by human coders based on accident narratives. Previous studies have examined automated and semi-automated filtering approaches that use machine learning (ML) models to assign single E-codes to accident narratives. In this study, our goal was to examine the effectiveness of these approaches for assigning groups of injury codes. We did this for three types of injury codes (E-code, MIF, and Intent) using three ML models (Logistic Regression, Support Vector Machine, and a Long Short-Term Memory based recurrent neural network). We also tested four filtering strategies that used the probability of prediction correctness assigned by the Logistic Regression model. The approaches were evaluated on a manually coded dataset of about half a million injury cases provided by the Queensland Injury Surveillance Unit. The three ML models performed very similarly. The overall sensitivity of each model was quite high and almost identical across models: 0.81–0.82 for E-code, 0.69–0.71 for MIF, and 0.96–0.97 for Intent. However, the unweighted sensitivities were lower (0.67–0.75 for E-code, 0.59–0.62 for MIF, and 0.46–0.56 for Intent), reflecting a general tendency of each model to under-predict small categories. The probability of correctly assigning all three codes was also low (0.58). Filtering approaches resulted in large improvements in sensitivity for smaller categories and in the probability of predicting all three codes correctly.
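
The gap between overall and unweighted sensitivity, and the effect of filtering on it, can be illustrated with a minimal sketch. The function names, the toy labels, and the single confidence threshold below are assumptions made for illustration only; the abstract does not specify the four filtering strategies, so this is not the study's implementation. Overall sensitivity is taken here as the fraction of all cases coded correctly, and unweighted sensitivity as the plain average of per-category sensitivities, so that small categories count as much as large ones.

```python
from collections import defaultdict


def sensitivities(y_true, y_pred):
    """Return (overall, unweighted) sensitivity for single-label predictions."""
    hits = defaultdict(int)    # correct predictions per true category
    totals = defaultdict(int)  # number of cases per true category
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    # Overall: every case counts equally, so large categories dominate.
    overall = sum(hits.values()) / len(y_true)
    # Unweighted: every category counts equally, regardless of size.
    per_category = [hits[c] / totals[c] for c in totals]
    unweighted = sum(per_category) / len(per_category)
    return overall, unweighted


def filter_by_confidence(confidences, threshold):
    """Split case indices into auto-coded and manual-review groups.

    Hypothetical filter: a case is auto-coded only if the model's
    predicted probability reaches the threshold.
    """
    auto = [i for i, c in enumerate(confidences) if c >= threshold]
    manual = [i for i, c in enumerate(confidences) if c < threshold]
    return auto, manual


if __name__ == "__main__":
    # Toy data: one large category ("fall") and one small one ("burn").
    y_true = ["fall"] * 8 + ["burn"] * 2
    y_pred = ["fall"] * 8 + ["fall", "burn"]
    conf = [0.9] * 8 + [0.55, 0.8]

    print(sensitivities(y_true, y_pred))

    auto, manual = filter_by_confidence(conf, threshold=0.7)
    kept_true = [y_true[i] for i in auto]
    kept_pred = [y_pred[i] for i in auto]
    print(sensitivities(kept_true, kept_pred), "manual review:", manual)
```

In this toy case the misclassified small-category case is also the low-confidence one, so routing it to manual review raises the unweighted sensitivity of the auto-coded set. Real data will not separate this cleanly, and the threshold trades coding accuracy against manual workload.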
