Ensemble Learning with Pre-Trained Transformers for Crash Severity Classification: A Deep NLP Approach

Shadi Jaradat, Alexander Paz, Mohammed Elhenawy, Richi Nayak

doi:10.3390/a17070284

Abstract

Transfer learning has gained significant traction in natural language processing due to the emergence of state-of-the-art pre-trained language models (PLMs). Unlike traditional word embedding methods such as TF-IDF and Word2Vec, PLMs are context-dependent and outperform conventional techniques when fine-tuned for specific tasks. This paper proposes a hard voting classifier to enhance crash severity classification by combining machine learning and deep learning models with various word embedding techniques, including BERT, RoBERTa, Word2Vec, and TF-IDF. Our study involves two experiments using motorists’ crash data from the Missouri State Highway Patrol. The first experiment evaluates the performance of three machine learning models, XGBoost (XGB), random forest (RF), and naive Bayes (NB), each paired with TF-IDF, Word2Vec, and BERT feature extraction techniques. Additionally, BERT and RoBERTa are fine-tuned with a Bidirectional Long Short-Term Memory (Bi-LSTM) classification model. All models are first evaluated on the original dataset. The second experiment repeats the evaluation using an augmented dataset to address the severe data imbalance. The results from the original dataset show strong performance for all models in the “Fatal” and “Personal Injury” classes but poor classification of the minority “Property Damage” class. On the augmented dataset, the models continued to excel on the majority classes, but only XGB/TF-IDF and BERT-LSTM showed improved performance for the minority class. The ensemble model outperformed the individual models on both datasets, achieving an F1 score of 99% for “Fatal” and “Personal Injury” and 62% for “Property Damage” on the augmented dataset. These findings suggest that ensemble models, combined with data augmentation, are highly effective for crash severity classification and potentially for other text classification tasks.
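As a rough illustration of the hard voting idea described in the abstract, the sketch below takes label predictions from several already-trained models, picks the majority label for each sample, and reports a per-class F1 score. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `hard_vote` and `per_class_f1`, the tie-breaking rule (earliest model in the list wins), and the toy predictions are all hypothetical and do not come from the paper.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Return the majority label per sample across models.

    predictions_per_model: list of equal-length label lists, one per model.
    Ties go to the label predicted by the earliest model in the list
    (an assumption; the abstract does not state a tie-breaking rule).
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    for sample_votes in zip(*predictions_per_model):
        counts = Counter(sample_votes)
        top = max(counts.values())
        # First label, in model order, that reaches the highest vote count.
        voted.append(next(lbl for lbl in sample_votes if counts[lbl] == top))
    return voted


def per_class_f1(y_true, y_pred):
    """Compute F1 = 2*TP / (2*TP + FP + FN) separately for each class."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Toy labels and predictions, invented purely for illustration.
y_true = ["Fatal", "Personal Injury", "Property Damage", "Personal Injury", "Fatal"]
model_a = ["Fatal", "Personal Injury", "Personal Injury", "Personal Injury", "Fatal"]
model_b = ["Fatal", "Personal Injury", "Property Damage", "Fatal", "Fatal"]
model_c = ["Fatal", "Fatal", "Property Damage", "Personal Injury", "Personal Injury"]

ensemble = hard_vote([model_a, model_b, model_c])
print(per_class_f1(y_true, ensemble))
```

In this toy case each model makes a different mistake, so the majority vote recovers every label even though no single model does. Reporting F1 per class rather than as one aggregate number is what makes a weak minority class such as “Property Damage” visible.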

Journal: Algorithms	Publication Date: Jun 30, 2024
Citations: 1	License type: CC BY 4.0

Similar Papers

Deep learning‐based smishing message identification using regular expression feature generation
Aakanksha Sharaff ... Vrihas Pathak
Expert Systems | VOL. 40
05 Oct 2022

Using AraGPT and ensemble deep learning model for sentiment analysis on Arabic imbalanced dataset
Nassera Habbat ... A Elamri
ITM Web of Conferences | VOL. 52
01 Jan 2023

Automatic Fault Delineation in 3-D Seismic Images With Deep Learning: Data Augmentation or Ensemble Learning?
Shizhen Li ... Jicai Ding
IEEE Transactions on Geoscience and Remote Sensing | VOL. 60
01 Jan 2021

Physics-Guided Data Augmentation Combined with Unsupervised Learning Improves Stability and Accuracy of Bit Wear Deep Learning Model
Huang Xu ... Guodong David Zhan
27 Feb 2024
