Simulator Pre-Screening of Underprepared Drivers Prior to Licensing On-Road Examination: Clustering of Virtual Driving Test Time Series Data.

David Grethlein, Sean Tanner, Flaura Koplin Winston, Elizabeth Walshe, Santiago Ontañón, Venk Kandadai

doi:10.2196/13995

Open Access

https://doi.org/10.2196/13995

Abstract

Background: A large Midwestern state commissioned a virtual driving test (VDT) to assess driving skills preparedness before the on-road examination (ORE). Since July 2017, a pilot deployment of the VDT in state licensing centers (VDT pilot) has collected both VDT and ORE data from new license applicants with the aim of creating a scoring algorithm that could predict those who were underprepared.

Objective: Leveraging data collected from the VDT pilot, this study aimed to develop and conduct an initial evaluation of a novel machine learning (ML)–based classifier using limited domain knowledge and minimal feature engineering to reliably predict applicant pass/fail on the ORE. Such methods, if proven useful, could be applicable to the classification of other time series data collected within medical and other settings.

Methods: We analyzed an initial dataset that comprised 4308 drivers who completed both the VDT and the ORE, in which 1096 (25.4%) drivers went on to fail the ORE. We studied 2 different approaches to constructing feature sets to use as input to ML algorithms: the standard method of reducing the time series data to a set of manually defined variables that summarize driving behavior and a novel approach using time series clustering. We then fed these representations into different ML algorithms to compare their ability to predict a driver’s ORE outcome (pass/fail).

Results: The new method using time series clustering performed similarly compared with the standard method in terms of overall accuracy for predicting pass or fail outcome (76.1% vs 76.2%) and area under the curve (0.656 vs 0.682). However, the time series clustering slightly outperformed the standard method in differentially predicting failure on the ORE. The novel clustering method yielded a risk ratio for failure of 3.07 (95% CI 2.75-3.43), whereas the standard variables method yielded a risk ratio for failure of 2.68 (95% CI 2.41-2.99). In addition, the time series clustering method with logistic regression produced the lowest ratio of false alarms (those who were predicted to fail but went on to pass the ORE; 27.2%).

Conclusions: Our results provide initial evidence that the clustering method is useful for feature construction in classification tasks involving time series data when resources are limited to create multiple, domain-relevant variables.
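The risk ratio and false alarm figures reported in the Results can be illustrated with a short calculation on a 2x2 table of predicted versus actual ORE outcomes. The sketch below is not the authors' code. It assumes the confidence interval is the usual Wald interval on the log scale, and that the false alarm ratio is taken over all drivers predicted to fail; the abstract states neither. The counts are hypothetical and chosen only to make the arithmetic easy to follow, since the abstract does not report the underlying table.

```python
import math


def risk_ratio_ci(fail_pred_fail, pass_pred_fail, fail_pred_pass, pass_pred_pass, z=1.96):
    """Risk ratio of ORE failure (predicted-fail group vs predicted-pass group).

    Returns (rr, lower, upper) using a Wald interval on the log scale.
    This interval method is an assumption; the abstract does not name one.
    """
    n_pred_fail = fail_pred_fail + pass_pred_fail
    n_pred_pass = fail_pred_pass + pass_pred_pass

    # Observed failure risk within each predicted group.
    risk_pred_fail = fail_pred_fail / n_pred_fail
    risk_pred_pass = fail_pred_pass / n_pred_pass
    rr = risk_pred_fail / risk_pred_pass

    # Standard error of ln(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / fail_pred_fail - 1 / n_pred_fail
        + 1 / fail_pred_pass - 1 / n_pred_pass
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def false_alarm_ratio(fail_pred_fail, pass_pred_fail):
    """Share of drivers predicted to fail who went on to pass the ORE.

    The denominator (all drivers predicted to fail) is an assumption.
    """
    return pass_pred_fail / (fail_pred_fail + pass_pred_fail)


# Hypothetical counts, NOT taken from the study.
rr, lower, upper = risk_ratio_ci(
    fail_pred_fail=60,   # predicted fail, actually failed
    pass_pred_fail=40,   # predicted fail, actually passed (false alarms)
    fail_pred_pass=20,   # predicted pass, actually failed
    pass_pred_pass=80,   # predicted pass, actually passed
)
far = false_alarm_ratio(fail_pred_fail=60, pass_pred_fail=40)

print(f"risk ratio = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
print(f"false alarm ratio = {far:.1%}")
```

Read this way, a risk ratio near 3 means that drivers flagged by the classifier failed the ORE about three times as often as drivers who were not flagged.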

Journal: Journal of Medical Internet Research
Publication Date: Jun 18, 2020
Citations: 5
License type: CC BY

Similar Papers

An efficient implementation of anytime k-medoids clustering for time series under dynamic time warping
Van The Huy ... Duong Tuan Anh
08 Dec 2016

Incremental Clustering for Time Series Data Based on an Improved Leader Algorithm
Huynh Thi Thu Thuy ... Vo Thi Ngoc Chau
01 Mar 2019

Feature-Based Clustering for Electricity Use Time Series Data
Teemu Räsänen ... Mikko Kolehmainen
01 Jan 2009

MDL-based time series clustering
Thanawin Rakthanmanon ... Eamonn J Keogh
Knowledge and Information Systems | VOL. 33
12 Jun 2012
