Modeling Pedestrian Injury Severity: A Case Study of Using Extreme Gradient Boosting Vs Random Forest in Feature Selection

Zhenxi Wu, Aditi Misra, Shan Bao

doi:10.1177/03611981231170014

Abstract

Walking and bicycling are lauded for their negative net carbon impact and for their health benefits. However, national crash statistics suggest that pedestrians are disproportionately harmed in any vehicle–pedestrian conflict situation. Although automated transportation in the future is anticipated to increase overall safety, multiple incidents involving automated vehicles have been reported recently, indicating that the technology needs more training on real-world scenarios and conflicts. This research is motivated by the need for contextual data and related levels of harm in potential conflict scenarios in mixed traffic and we use a national police reported crash dataset, CRSS, to address this need. Our study uses a new gradient boosting algorithm, XGBoost, to identify important features among a host of seemingly significant variables. We compare the performance of XGBoost with the more frequently used random forest method and find that XGBoost is more reliable and robust for handling an unbalanced and sparse dataset like crash data, and the features extracted are more aligned to findings from previous research on the topic. We also compare feature importance between NASS-GES and CRSS—two national crash databases with different sampling strategies but the same objective—and find that sampling strategy influences selection of feature importance. We further use the features extracted using XGBoost in a multiclass logistic regression to quantify the effect of these features on different levels of pedestrian injury. Our findings indicate that speed limit, light conditions, pre-crash movements, and location of pedestrian are important contributors to crash severity, along with driver distraction and impairment.
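The workflow described in the abstract (rank features with a boosted-tree model and with a random forest, compare the rankings, then fit a multiclass logistic regression on the selected features) might be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the data are synthetic rather than CRSS records, scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, and the cutoff of five features and all variable names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced 3-class data standing in for injury-severity levels.
# These are NOT crash records; every value is generated for illustration.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=5, n_redundant=2,
    n_classes=3, weights=[0.7, 0.2, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def top_features(model, k=5):
    """Fit the model on the training split and return the indices of its
    k most important features (k=5 is an arbitrary, assumed cutoff)."""
    model.fit(X_train, y_train)
    return np.argsort(model.feature_importances_)[::-1][:k]


# Gradient boosting (a stand-in for XGBoost) versus random forest rankings.
gb_top = top_features(GradientBoostingClassifier(random_state=0))
rf_top = top_features(RandomForestClassifier(n_estimators=200, random_state=0))
shared = sorted(set(gb_top.tolist()) & set(rf_top.tolist()))

# Multiclass logistic regression restricted to the boosting-selected features.
logit = LogisticRegression(max_iter=1000)
logit.fit(X_train[:, gb_top], y_train)
accuracy = logit.score(X_test[:, gb_top], y_test)

print("Boosting top features:     ", gb_top.tolist())
print("Random forest top features:", rf_top.tolist())
print("Selected by both:          ", shared)
print("Coefficient matrix shape (classes x features):", logit.coef_.shape)
print("Held-out accuracy:", round(accuracy, 3))
```

Each row of `logit.coef_` holds one class's coefficients for the selected features, which is the kind of per-severity-level effect the abstract refers to. On heavily imbalanced data, plain accuracy can be misleading, so a real analysis would likely report per-class metrics as well.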

Journal: Transportation Research Record: Journal of the Transportation Research Board
Publication Date: May 15, 2023
Citations: 1

Similar Papers

Feature selection to increase the random forest method performance on high dimensional data
Maria Irmina Prasetiyowati ... Kridanto Surendro
International Journal of Advances in Intelligent Informatics | VOL. 6
06 Nov 2020

Using ordered and unordered logistic regressions to investigate risk factors associated with pedestrian crash injury severity in Victoria, Australia
Kayvan Aghabayk ... Arsalan Esmaili
Journal of Safety Research | VOL. 81
08 Feb 2022

Random forest and decision tree algorithms for car price prediction
Bister Purba ... Azanuddin Azanuddin
Jurnal Matematika Dan Ilmu Pengetahuan Alam LLDikti Wilayah 1 (JUMPA) | VOL. 3
26 Apr 2023

Application of the Random Forest Method in Studies of Local Lymph Node Assay Based Skin Sensitization Data
Adam Fedorowicz ... Harshinder Singh
Journal of Chemical Information and Modeling | VOL. 45
21 Apr 2005
