Abstract

Traffic crashes are a critical safety concern. Many studies have sought to improve traffic safety by applying diverse statistical and machine learning models to a wide range of safety topics. However, the data elements contained in police-reported crash narratives are not routinely analyzed alongside coded, structured crash data. In recent years, many researchers have investigated the unstructured text of traffic crash narratives, but most of these studies are basic text mining applications, and their datasets are often limited in size. This study applied an advanced language model, Bidirectional Encoder Representations from Transformers (BERT), to classify traffic injury types using a dataset of over 750,000 unique crash narrative reports. The models achieve a predictive accuracy of 84.2% ± 0.5 and an area under the receiver operating characteristic curve (AUC) of 0.93 ± 0.06 per class. Overall, the findings can assist safety engineers and analysts in determining the causes of a crash. Classifying crash injury types with a language model such as BERT is a valuable tool for identifying additional factors that contribute to crashes, which can reveal new areas for safety countermeasures and support the development of new safety strategies.
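The two reported metrics, overall accuracy and per-class AUC, can be illustrated with a minimal sketch. It is not the study's evaluation code. The class names, the toy labels, and the predicted probabilities below are all hypothetical, and the one-vs-rest reading of "AUC per class" is an assumption, since the abstract does not say how the per-class values or the ± spread were computed.

```python
from statistics import mean, pstdev


def binary_auc(labels, scores):
    """AUC as the probability that a random positive example outscores
    a random negative one; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def per_class_auc(y_true, probs, classes):
    """One-vs-rest AUC for each class; probs[i][k] is the score of class k."""
    result = {}
    for k, name in enumerate(classes):
        labels = [1 if y == name else 0 for y in y_true]
        scores = [row[k] for row in probs]
        result[name] = binary_auc(labels, scores)
    return result


def accuracy_score(y_true, probs, classes):
    """Fraction of examples whose highest-scoring class is the true class."""
    hits = 0
    for y, row in zip(y_true, probs):
        predicted = classes[max(range(len(row)), key=row.__getitem__)]
        hits += predicted == y
    return hits / len(y_true)


# Hypothetical injury classes and toy model outputs (not from the study).
classes = ["no_injury", "minor", "severe"]
y_true = ["no_injury", "minor", "severe", "no_injury", "severe", "minor"]
probs = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.1, 0.3, 0.6],
    [0.4, 0.4, 0.2],
    [0.2, 0.2, 0.6],
    [0.5, 0.3, 0.2],
]

aucs = per_class_auc(y_true, probs, classes)
accuracy = accuracy_score(y_true, probs, classes)
auc_mean = mean(aucs.values())
auc_spread = pstdev(aucs.values())  # spread across classes, one possible reading of "±"
```

On real data the probabilities would come from the classifier's softmax output, and a library routine would usually replace the quadratic pairwise loop in `binary_auc`, which is written this way only to keep the definition visible.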
