A Bibliometric Analysis and Benchmark of Machine Learning and AutoML in Crash Severity Prediction: The Case Study of Three Colombian Cities.

Juan S Angarita-Zapata, Gina Maestre-Gongora, Jenny Fajardo Calderín

doi:10.3390/s21248401

Open Access

https://doi.org/10.3390/s21248401

Abstract

Traffic accidents are of worldwide concern, as they are one of the leading causes of death globally. One policy response is the design and deployment of road safety systems. These aim to predict crashes based on historical records, provided by new Internet of Things (IoT) technologies, to enhance traffic flow management and promote safer roads. Increasing data availability has helped machine learning (ML) to address the prediction of crashes and their severity. The literature reports numerous contributions regarding survey papers, experimental comparisons of various techniques, and the design of new methods at the point where crash severity prediction (CSP) and ML converge. Despite such progress, and as far as we know, there are no comprehensive research articles that theoretically and practically approach the model selection problem (MSP) in CSP. Thus, this paper introduces a bibliometric analysis and experimental benchmark of ML and automated machine learning (AutoML) as a suitable approach to automatically address the MSP in CSP. First, 2318 bibliographic references were consulted to identify relevant authors, trending topics, keyword evolution, and the most common ML methods used in related case studies, which revealed an opportunity for the use of AutoML in the transportation field. Then, we compared AutoML (AutoGluon, Auto-sklearn, TPOT) and ML (CatBoost, Decision Tree, Extra Trees, Gradient Boosting, Gaussian Naive Bayes, Light Gradient Boosting Machine, Random Forest) methods in three case studies using open data portals belonging to the cities of Medellín, Bogotá, and Bucaramanga in Colombia. Our experimentation reveals that AutoGluon and CatBoost are competitive and robust ML approaches to deal with various CSP problems. In addition, we concluded that general-purpose AutoML effectively supports the MSP in CSP without developing domain-focused AutoML methods for this supervised learning problem. Finally, based on the results obtained, we introduce challenges and research opportunities that the community should explore to enhance the contributions that ML and AutoML can bring to CSP and other transportation areas.
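One way to picture a benchmark of several methods over several case studies is to rank the methods within each dataset and then average those ranks. The following is a minimal sketch under stated assumptions, not the paper's procedure: the function name `average_ranks`, the dataset keys, the method names, and every score are hypothetical placeholders, and the abstract does not say which ranking or test the authors applied.

```python
from statistics import mean


def average_ranks(scores):
    """Return the mean rank of each method across datasets.

    scores: {dataset: {method: score}}, where a higher score is better.
    Rank 1 is best; tied methods share the average of their positions.
    """
    methods = sorted(next(iter(scores.values())))
    ranks = {m: [] for m in methods}
    for row in scores.values():
        ordered = sorted(methods, key=lambda m: row[m], reverse=True)
        i = 0
        while i < len(ordered):
            # Extend j to cover every method tied with the one at position i.
            j = i
            while j + 1 < len(ordered) and row[ordered[j + 1]] == row[ordered[i]]:
                j += 1
            shared = ((i + 1) + (j + 1)) / 2  # mean of 1-based positions i+1..j+1
            for m in ordered[i:j + 1]:
                ranks[m].append(shared)
            i = j + 1
    return {m: mean(r) for m, r in ranks.items()}


# Hypothetical placeholder scores, NOT results reported in the paper.
hypothetical_scores = {
    "city_1": {"method_a": 0.80, "method_b": 0.75, "method_c": 0.75},
    "city_2": {"method_a": 0.60, "method_b": 0.65, "method_c": 0.55},
    "city_3": {"method_a": 0.70, "method_b": 0.68, "method_c": 0.50},
}

ranks = average_ranks(hypothetical_scores)
best = min(ranks, key=ranks.get)  # method with the lowest (best) mean rank
```

A method with a low mean rank across all datasets is what a benchmark would loosely call robust; rank-based significance tests build on the same per-dataset ranks.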

Highlights

We introduce the machine learning (ML) and automated machine learning (AutoML) methods chosen for the experimentation, the metric used to measure performance, and the statistical tests considered to assess the significance of the results.
The above is justified because we aim to compare the performance of AutoML versus the baseline using the same human effort for both in order to carry out a fairer comparison.
We compare the competitiveness and significance of general-purpose AutoML versus the ad hoc ML methods identified in the bibliometric analysis on binary (Medellín) and multiclass (Bogotá, Bucaramanga) problems with different degrees of data imbalance.
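The choice of metric matters when classes are imbalanced, as the highlights note for the three cities. The sketch below illustrates the general point with a macro-averaged F1 score next to plain accuracy. It is an assumption made for illustration: this page does not name the metric used in the paper, and the class labels `damage_only` and `injury` and the toy predictions are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical toy data: 9 common-class crashes, 1 rare-class crash.
y_true = ["damage_only"] * 9 + ["injury"]
# A degenerate model that always predicts the majority class.
y_pred = ["damage_only"] * 10

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
```

Here the always-majority model reaches an accuracy of 0.9 while its macro F1 is 9/19 (about 0.47), because it never identifies the rare class.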

Summary

Introduction

One global challenge is designing and promoting policies to reduce traffic crashes, which are among the leading causes of death and injury worldwide [1]. In this sense, Intelligent Transportation Systems (ITS) and new Information and Communication Technologies (ICTs) (e.g., the Internet of Things) are crucial factors that can contribute to accomplishing such an aim [2,3]. We provide background information on AutoML and related work focused on other literature reviews and experimental comparisons at the point where CSP and ML meet.

Journal: Sensors (Basel, Switzerland)	Publication Date: Dec 16, 2021
Citations: 12	License type: CC BY 4.0
