Flexible and Robust Method for Missing Loop Detector Data Imputation

Kristian Henrickson, Yajie Zou, Yinhai Wang

doi:10.3141/2527-04

Abstract

This study addresses the imputation of missing traffic sensor data, with the aim of improving the coverage and accuracy of traffic analysis and performance estimation. Missing data, whether attributable to hardware failure or to error detection and removal, are a constant problem in loop and other traffic detector data sets. As the rate of missingness increases, the treatment of missing values quickly becomes the controlling factor in overall data quality. Several imputation approaches have previously been developed for traffic data, but few studies address data with large blocks of missing values for networkwide implementation. A proven predictive mean matching multiple imputation method is introduced and applied to loop detector volume data collected on Interstate 5 in Washington State. With the iterative multiple imputation by chained equations approach, the spatial correlation between nearby detectors was used for prediction, and missing values in the predictors themselves were handled effectively. The proposed methodology is shown to perform well on a range of missing data patterns, including missing completely at random, missing days, and missing months. After the imputation method was applied to 20-s data and postimputation aggregation was performed, the results suggest that the proposed method can outperform elementary pairwise regression and produce reliable imputation estimates, even when entire days and months are missing from the data set. Thus, the predictive mean matching multiple imputation method can be used as an effective approach for imputing missing traffic data in a range of challenging scenarios.
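To make the idea in the abstract concrete, the following is a minimal sketch of predictive mean matching inside a chained-equations loop. It is not the authors' implementation. The function names (pmm_impute, chained_pmm), the donor count k, the number of sweeps n_iter, and the column-mean initialization are all illustrative assumptions. It also uses a plain least-squares fit rather than drawing regression coefficients from a posterior distribution, which a full multiple imputation procedure would normally do.

```python
import numpy as np

def pmm_impute(y, X, k=5, rng=None):
    """Fill NaNs in y by predictive mean matching on fully observed predictors X.

    Simplified sketch: coefficients come from ordinary least squares, with no
    posterior draw. Each missing value is replaced by the observed value of a
    donor chosen at random among the k rows with the closest predicted mean.
    """
    rng = np.random.default_rng(rng)
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(y)), X])  # add intercept column
    obs = ~np.isnan(y)
    beta, *_ = np.linalg.lstsq(A[obs], y[obs], rcond=None)
    pred = A @ beta
    pred_obs, y_obs = pred[obs], y[obs]
    out = y.copy()
    for i in np.flatnonzero(~obs):
        # indices of the k observed rows whose predictions are nearest
        donors = np.argsort(np.abs(pred_obs - pred[i]))[:k]
        out[i] = y_obs[rng.choice(donors)]
    return out

def chained_pmm(data, n_iter=5, k=5, seed=0):
    """One completed data set via a chained-equations sweep over all columns.

    Columns stand in for nearby detectors (an assumption for illustration).
    Missing cells start at the column mean, then each column is re-imputed in
    turn using the current fill of the other columns as predictors.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    filled = data.copy()
    col_means = np.nanmean(data, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])
    for _ in range(n_iter):
        for j in range(data.shape[1]):
            if not missing[:, j].any():
                continue
            y = filled[:, j].copy()
            y[missing[:, j]] = np.nan  # restore the gaps in the target column
            X = np.delete(filled, j, axis=1)
            filled[:, j] = pmm_impute(y, X, k=k, rng=rng)
    return filled
```

Calling chained_pmm several times with different seed values gives several completed data sets, which is the "multiple" part of multiple imputation. Because every imputed value is copied from an observed donor, imputed volumes always fall within the range of values actually recorded at that detector.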

Full Text