Abstract

We analyzed commercial fleet operations data collected by a South Korean rent-a-car company, SK Networks Co. Ltd., to evaluate the differences between collision-free and collision-involved drivers, with the ultimate goal of predicting driver collision risk. The first objective was to identify critical variables related to collision risk. The second objective was to build and compare classification models to predict the collision involvement of a driver. Data used in the analysis were collected through Long-Range (LoRa) Internet of Things (IoT) modem-Fleet Management System (FMS) devices, the first commercial implementation of LoRa modems in vehicles. These devices have five main built-in modules: an On-Board Diagnostics (OBD-II) connector, GPS, a LoRa modem, a gravity sensor, and Bluetooth. They can communicate with the vehicle, the driver's smartphone, and the host server. Data from 3,854 drivers with a total of 2.19 million trips recorded in 2018 were explored. Out of these 3,854 drivers, 514 (13.3%) were involved in at least one collision. Predictor variables were selected based on previous research that utilized naturalistic data to identify factors affecting collision risk (Dingus et al., 2016; Tselentis, Yannis, & Vlahogianni, 2016; Bian, Yang, Zhao, & Liang, 2018; Jin et al., 2018). Forty-eight predictor variables that may affect collision risk were selected, which can be categorized into two groups: 27 variables were related to the business and the environment, characterized by how much drivers traveled, in what type of vehicles, on what types of roads, and during what times of the day; the other 22 variables were driving behavior-related variables, capturing overspeeding, potential fatigue, rapid speed changes, and counts of traffic regulation violations. After a feature selection phase based on univariate analysis, nine variables were selected to be used in the classification models. 
These selected variables are running driving time (driving time excluding idling time), trip frequency per thousand kilometers driven, accumulated violation count, accumulated fine amount, the percentage of trips driven in a compact car (<1,000 cc), the percentage of trips driven in a car older than the 2016 model year, the percentage of trips during 6 a.m.-9 a.m., the percentage of trips that ended during 2 a.m.-7 a.m., and the sum of rapid acceleration and deceleration frequencies per kilometer. A total of twenty classification models were built and compared to classify collision-involved and non-collision-involved drivers: 5 classification modeling techniques (Logistic Regression (LR), Random Forest (RF), k-Nearest Neighbor (kNN), Support Vector Machine (SVM), and Gradient Boosted Trees (GBT)) × 4 sampling methods (up-sampling, down-sampling, SMOTE, and no sampling). The down-sampled GBT model showed the best classification performance according to the Area Under the Curve (0.804) and Area Under the Precision-Recall Curve (0.406) statistics. Comparing relative variable importance values for the best three classification models (GBT, RF, and LR), running driving time and violation count were found to be the most influential variables, followed by the sum of rapid acceleration and deceleration frequencies, accumulated fine amount, trip frequency per thousand kilometers driven, and the percentage of trips driven in a compact car. These results agree with the results of previous naturalistic studies: driver behavior-related variables are highly related to collision likelihood, although running driving time in our dataset was likely dictated by businesses. This dataset provided us with a unique opportunity to take an in-depth look at the relationship between collisions and business, environment, and driving behavior-related variables by using naturalistic data from newly developed LoRa IoT-FMS devices. 
To the best of our knowledge, this study was the first naturalistic study to connect driving data with various types of traffic violation records (e.g., overspeeding, lane, sign, parking, toll fees, and fine amount). Interestingly, non-driving-related violation types such as parking or toll-fee violation counts were also strongly correlated with collision involvement, suggesting that collision involvement is likely not just a skill issue but also an attitude issue regarding the law. In terms of industrial applications, this study suggests multiple opportunities. Through a better understanding of the influential variables related to collision involvement (e.g., accumulated violations), fleet operators can build policies to enhance their fleet safety, reducing collision rates and the associated costs. Further, in the long term, this study can provide a framework for developing a Usage-Based Rent-a-car (UBR) service for the car rental field, similar to Usage-Based Insurance (UBI), which can reduce drivers' rental fees based on their driving behaviors.
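
The model comparison described above (5 techniques × 4 sampling methods, evaluated on an imbalanced sample of 514 collision-involved drivers out of 3,854) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the random seed, and the pairwise AUC computation are assumptions made for illustration, and the sketch assumes that "Area under the Curve" refers to the ROC curve. It uses only the Python standard library and shows how the 20-model grid is enumerated, what down-sampling does to the class balance, and how an AUC value can be computed from labels and scores.

```python
import itertools
import random

# Technique and sampling names come from the abstract; all function names,
# the seed, and the data layout are hypothetical illustrations.
TECHNIQUES = ["LR", "RF", "kNN", "SVM", "GBT"]
SAMPLING = ["up", "down", "smote", "none"]


def model_grid():
    """Enumerate every (technique, sampling) pair: 5 x 4 = 20 models."""
    return list(itertools.product(TECHNIQUES, SAMPLING))


def down_sample(labels, seed=0):
    """Return sorted indices of a class-balanced subset.

    Keeps every minority-class row plus an equally sized random draw
    (without replacement) from the majority class.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = rng.sample(majority, len(minority))
    return sorted(minority + kept)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the positive
    receives the higher score; ties count as half a win.
    Runs in O(n_pos * n_neg), which is acceptable for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Class balance reported in the abstract: 514 of 3,854 drivers had a collision.
labels = [1] * 514 + [0] * (3854 - 514)
balanced = down_sample(labels)
prevalence = sum(labels) / len(labels)  # about 0.133
```

For context on the reported statistics: a classifier that ranks drivers at random has an expected area under the precision-recall curve equal to the positive-class prevalence (about 0.133 here), so the reported value of 0.406 is roughly three times that baseline.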
