Abstract

Traffic data play an essential role in Intelligent Transportation Systems (ITS), which require complete data for traffic control, management, guidance, and evaluation. However, data collected from many different types of sensors often contain missing values because of sensor damage or data transmission errors, which reduces the effectiveness and reliability of ITS. A sound imputation method is therefore needed to ensure the quality and integrity of traffic flow data. Most existing imputation methods do not fully account for the information still available from sensors that have missing data, nor for the spatiotemporal correlation of traffic flow. This paper proposes a traffic data imputation method based on improved low-rank matrix decomposition (ILRMD), which accounts for the influence of missing data and exploits the spatiotemporal correlation among traffic data. The proposed method uses not only the traffic data from sensors surrounding the sensor with missing data, but also the data recorded by that sensor itself. Information about the missing data is reflected in the coefficient matrix, and the spatiotemporal correlation characteristics are applied to obtain more accurate imputation results. Real traffic data collected from the Caltrans Performance Measurement System (PeMS) are used to evaluate the imputation performance of the proposed method. Experimental results show that the proposed method improves average imputation accuracy by 87.07% compared with the SVR, ARIMA, KNN, DBN-SVR, WNN, and traditional MC methods, and that it is an effective method for data imputation.
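
To make the idea of low-rank imputation concrete, the following is a minimal sketch of a generic iterative truncated-SVD fill. It is not the ILRMD method of the paper: the function name `lowrank_impute`, the rank, the iteration count, and the synthetic sensor-by-time matrix are all assumptions chosen for illustration.

```python
import numpy as np

def lowrank_impute(X, rank=2, n_iter=200):
    """Fill NaN entries of X using a rank-`rank` approximation (illustrative only)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initial guess: replace each missing entry with its column mean.
    col_mean = np.nanmean(X, axis=0)
    filled = np.where(missing, col_mean, X)
    for _ in range(n_iter):
        # Project the current estimate onto the set of rank-`rank` matrices.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed entries fixed; update only the missing ones.
        filled = np.where(missing, approx, X)
    return filled

# Synthetic example (assumed data): an exactly rank-2 matrix with ~20% missing.
rng = np.random.default_rng(0)
truth = rng.random((30, 2)) @ rng.random((2, 20))
mask = rng.random(truth.shape) < 0.2
observed = truth.copy()
observed[mask] = np.nan

imputed = lowrank_impute(observed, rank=2)
mean_filled = np.where(mask, np.nanmean(observed, axis=0), observed)

def rmse_on_missing(estimate):
    return np.sqrt(np.mean((estimate[mask] - truth[mask]) ** 2))

print("column-mean RMSE:", rmse_on_missing(mean_filled))
print("low-rank RMSE:   ", rmse_on_missing(imputed))
```

The sketch uses only the low-rank structure of the matrix. According to the abstract, the proposed method additionally reflects the missing-data information in the coefficient matrix and applies spatiotemporal correlation, neither of which is modeled here.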

Highlights

  • With the rapid development of the social economy, many kinds of large-scale road infrastructure have been built [1,2,3,4], but traffic congestion still occurs on highways

  • Missing data can generally be divided into three types: Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing at Determinate (MAD)

  • The data used to evaluate the performance of the proposed model were collected from mainline detectors in the Performance Measurement System (PeMS) database, which includes more than 39,000 individual sensors spanning the highway system across all major metropolitan areas of California


Summary

Introduction

With the rapid development of the social economy, many kinds of large-scale road infrastructure have been built [1,2,3,4], but traffic congestion still occurs on highways. Chen et al. [24] proposed an Autoregressive Integrated Moving Average with Generalized Autoregressive Conditional Heteroscedasticity (ARIMA-GARCH) model for traffic flow prediction. Such prediction methods fail to use the information from sensors with missing data, which can reduce data imputation accuracy. Pattern-neighboring methods exploit the similarity of daily traffic flow patterns [27] and estimate missing data from historical data collected by the same sensors on different days [17, 20]. Historical imputation methods fill each missing value with a known data point collected by the same sensor at the same time of day on a different day. These methods require highly stable historical data, whereas in practice traffic flow data are usually unstable and fluctuate to some extent.
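
The historical imputation idea described above can be sketched as follows. This is a minimal illustration under stated assumptions: the layout (one row per day, one column per time-of-day slot, for a single sensor), the function name `historical_impute`, and the choice to average the same slot over the other days are not specified in the source.

```python
import numpy as np

def historical_impute(flow):
    """Fill NaN entries with the mean of the same time slot on other days.

    flow: array of shape (days, slots_per_day) for one sensor (assumed layout).
    """
    flow = np.asarray(flow, dtype=float)
    # Mean of the observed values in each time-of-day slot across all days.
    slot_mean = np.nanmean(flow, axis=0)
    filled = flow.copy()
    missing = np.isnan(flow)
    # Broadcast the per-slot means to the full shape and copy them into the gaps.
    filled[missing] = np.broadcast_to(slot_mean, flow.shape)[missing]
    return filled

# Toy flow counts (assumed values): 3 days, 3 time slots, 2 missing readings.
flow = np.array([
    [120.0, 310.0, 280.0],
    [130.0, np.nan, 300.0],
    [np.nan, 330.0, 290.0],
])
filled = historical_impute(flow)
print(filled)
```

Because every gap is filled from other days alone, a day whose traffic deviates from the historical pattern is reconstructed poorly, which is the stability limitation noted above.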

Related Work
Traditional Imputation Method with LRMD
Traffic Data Imputation with ILRMD
Experiment Results
Results and Performances Analysis
Conclusions and Recommendations
