Abstract

Identifying high-risk drivers before an accident happens is necessary for traffic accident control and prevention. Because driving data are class-imbalanced, high-risk samples, as the minority class, are usually handled poorly by standard classification algorithms. Instead of applying preset sampling or cost-sensitive learning, this paper proposes a novel automated machine learning framework that simultaneously and automatically searches for the optimal sampling method, cost-sensitive loss function, and probability calibration method to handle the class-imbalance problem in recognizing risky drivers. The hyperparameters that control the sampling ratio and class weight, along with the other hyperparameters, are optimized by Bayesian optimization. To demonstrate the performance of the proposed framework, we establish a risky driver recognition model as a case study, using video-extracted vehicle trajectory data of 2427 private cars on a German highway. Based on rear-end collision risk evaluation, only 4.29% of all drivers are labeled as risky drivers. The inputs of the recognition model are the discrete Fourier transform coefficients of the target vehicle’s longitudinal speed, its lateral speed, and the gap between the target vehicle and its preceding vehicle. Among 12 sampling methods, 2 cost-sensitive loss functions, and 2 probability calibration methods, the configuration selected by automated machine learning is consistent with the one found by manual search, but it is obtained at a much lower computational cost. We find that the combination of Support Vector Machine-based Synthetic Minority Oversampling TEchnique (SVMSMOTE) sampling, a cost-sensitive cross-entropy loss function, and isotonic regression can significantly improve the recognition ability and reduce the error of the predicted probabilities.
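To make the model inputs concrete, the following is a minimal sketch of how discrete Fourier transform features could be built from the three trajectory signals named in the abstract. The function names, the use of coefficient magnitudes, the normalization by series length, and the number of retained coefficients are all assumptions for illustration; the abstract does not specify them.

```python
import cmath
import math


def dft_magnitudes(signal, n_coeffs):
    """Magnitudes of the first n_coeffs DFT coefficients, divided by the series length.

    Assumption: magnitudes and length normalization are illustrative choices,
    not details taken from the paper.
    """
    n = len(signal)
    feats = []
    for k in range(n_coeffs):
        # Direct evaluation of the k-th DFT coefficient.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        feats.append(abs(coeff) / n)
    return feats


def driver_features(long_speed, lat_speed, gap, n_coeffs=4):
    """Concatenate DFT features of the three signals into one feature vector.

    n_coeffs=4 is a hypothetical default, not a value from the paper.
    """
    return (dft_magnitudes(long_speed, n_coeffs)
            + dft_magnitudes(lat_speed, n_coeffs)
            + dft_magnitudes(gap, n_coeffs))
```

With this layout, a driver who holds a constant speed contributes energy only to the zeroth coefficient, while oscillating speed or gap shows up in the higher coefficients.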

Highlights

  • According to the statistics of historical accidents in road traffic, risky driving behavior is the leading cause of traffic insecurity [1]

  • This paper aims to build an automated machine learning (AutoML) framework that can automatically select the best sampling method, cost-sensitive loss function, probability calibration method, and the corresponding hyperparameters to establish a risky driver recognition model

  • We propose an AutoML framework that automatically and simultaneously selects the sampling method, sampling ratio, cost-sensitive loss function, minority class weight, and probability calibration method to maximize the evaluation metrics of the risky driving recognition model
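As an illustration of what a minority class weight does inside a cost-sensitive loss, here is a minimal sketch of a class-weighted binary cross-entropy. The exact loss used in the paper is not given in this excerpt, so the form below (a single weight on the risky class, averaged over samples) and the names `weighted_cross_entropy` and `minority_weight` are assumptions.

```python
import math


def weighted_cross_entropy(y_true, p_pred, minority_weight=1.0, eps=1e-12):
    """Mean binary cross-entropy with an extra weight on the positive (risky) class.

    Assumption: only the minority class is reweighted; minority_weight=1.0
    recovers the ordinary cross-entropy.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip probabilities so that log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        total += -(minority_weight * y * math.log(p)
                   + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)
```

With a weight above 1, a missed risky driver costs more than a false alarm on a normal driver. In the framework described here the weight is a tuned hyperparameter rather than a fixed constant.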

Introduction

According to the statistics of historical accidents in road traffic, risky driving behavior is the leading cause of traffic insecurity [1]. Quantifying and identifying risky driving behaviors and risky drivers is therefore crucial for road traffic safety. Most research on risky driving and risky driver recognition algorithms focuses on recognizing risky driving states, including aggressive driving, distracted driving, and fatigued driving. Wang et al. [2] used discrete Fourier coefficients of vehicle trajectory data, such as inter-vehicle distance and speed, as inputs to class-imbalance boosting algorithms that identify aggressive car-following drivers. Liu et al. [4] conducted a natural driving experiment and used the driver’s eye movement and hand movement data to establish a semi-supervised learning model for distracted driving. Chandrasiri et al. [5] used a driving simulator to observe the driver’s
