Abstract

In this paper, the risk patterns of e-bike riders in China were examined using tree-structured machine learning techniques. Three years of crash and violation data were acquired from the Kunshan traffic police department, China. First, based on quasi-induced exposure theory, high-risk (HR) electric bicycle (e-bike) riders were defined as those with at-fault crash involvement, while all others (i.e., non-at-fault or without crash involvement) were considered non-high-risk (NHR) riders. Then, demographic and previous violation-related features were developed for each e-bike rider from the crash/violation records. After that, a systematic machine learning (ML) framework was proposed to capture the complex risk patterns of those e-bike riders. An ensemble sampling method was selected to deal with the imbalanced datasets. Four tree-structured machine learning methods were compared, and a gradient boosting decision tree (GBDT) appeared to be the best. The feature importance and partial dependence were further examined. Interesting findings include the following: (1) tree-structured ML models are able to capture complex risk patterns and interpret them properly; (2) spatial-temporal violation features were found to be important indicators of high-risk e-bike riders; and (3) violation behavior features appeared to be more effective than violation punishment-related features in identifying high-risk e-bike riders. In general, the proposed ML framework is able to identify the complex crash risk patterns of e-bike riders. This paper provides useful insights for policy-makers and traffic practitioners regarding e-bike safety improvement in China.
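The abstract does not say which ensemble sampling method was used, so the following is only a minimal sketch of one common variant, assumed here for illustration: the minority class (HR riders, label 1) is kept whole, and each subset pairs it with an equally sized random draw from the majority class (NHR riders, label 0). The function name `balanced_subsets`, its parameters, and the toy labels are hypothetical and do not come from the paper.

```python
import random


def balanced_subsets(labels, n_subsets=5, seed=0):
    """Build index lists that each contain every HR rider (label 1)
    plus an equally sized random sample of NHR riders (label 0).

    This is an assumed under-sampling scheme, not the paper's exact method.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    if not minority or len(majority) < len(minority):
        raise ValueError("expected a non-empty minority class no larger than the majority")
    subsets = []
    for _ in range(n_subsets):
        # Draw without replacement inside one subset; different subsets may overlap.
        sampled = rng.sample(majority, len(minority))
        subsets.append(sorted(minority + sampled))
    return subsets


# Hypothetical toy labels: 3 HR riders among 20 riders.
labels = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
subsets = balanced_subsets(labels, n_subsets=4)
for subset in subsets:
    hr_share = sum(labels[i] for i in subset) / len(subset)
    print(subset, hr_share)
```

In a scheme like this, one classifier would typically be trained on each balanced subset and the predictions combined, so that no majority-class information is permanently discarded.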

Highlights

  • As a convenient, economical, energy-saving, and environmentally-friendly travel tool, electric bicycles (e-bikes) can meet the needs of people traveling short or medium distances, and they can also propel the development of the green sustainability concept

  • In order to reasonably identify the risk patterns of e-bike riders, a systematic machine learning-based approach was proposed to deal with a number of data mining issues, including imbalanced datasets, machine learning model selection, hyper-parameter tuning, and model validation

  • The false positive rate was calculated as the ratio between the number of NHRs wrongly categorized as HR and the total number of actual NHRs
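
The false positive rate defined in the last highlight can be illustrated with a short sketch. The function name `false_positive_rate` and the toy label lists are assumptions made for illustration; only the ratio itself (NHR riders wrongly categorized as HR, divided by all actual NHR riders) comes from the text.

```python
def false_positive_rate(actual, predicted):
    """Share of actual NHR riders that were wrongly categorized as HR."""
    pairs = list(zip(actual, predicted))
    false_positives = sum(1 for a, p in pairs if a == "NHR" and p == "HR")
    actual_negatives = sum(1 for a, _ in pairs if a == "NHR")
    if actual_negatives == 0:
        raise ValueError("no actual NHR riders; the rate is undefined")
    return false_positives / actual_negatives


# Hypothetical labels for six riders (not data from the paper).
actual = ["NHR", "NHR", "HR", "NHR", "HR", "NHR"]
predicted = ["HR", "NHR", "HR", "NHR", "NHR", "NHR"]

fpr = false_positive_rate(actual, predicted)
print(fpr)  # 1 of the 4 actual NHR riders is flagged as HR, so 0.25
```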



Introduction

As an economical, energy-saving, and environmentally-friendly travel tool, electric bicycles (e-bikes) can meet the needs of people traveling short or medium distances, and they can also propel the development of the green sustainability concept. China has become the largest e-bike producer and consumer in the world. E-bike ownership in China has sharply increased from 58 thousand to 29.96 million, from 1998 to. Experts predict that e-bike ownership may continue to grow with the technological innovation and policy improvement of e-bikes in the future [3]. Increased e-bike ownership brings many challenges for traffic safety [4]. Yao claimed that the number of crashes reached nearly 56.2 thousand, with an approximately
