Abstract

Although several methods for short-term forecasting of traffic volume have recently been developed, the literature lacks studies that focus on how to choose the appropriate prediction method on the basis of the statistical characteristics of the data set. This study first diagnosed the predictability of four traffic volume data sets by using several statistical measures: (a) complexity analysis methods, such as the delay time and embedding dimension method and the approximate entropy method; (b) nonlinearity analysis methods, such as the time reversibility of surrogate data; and (c) long-range dependency analysis techniques, such as the Hurst exponent. After the data sets were diagnosed, three models for short-term prediction of traffic volume were applied: (a) seasonal autoregressive integrated moving average (SARIMA), (b) k nearest neighbor (k-NN), and (c) support vector regression (SVR). The results of the statistical data diagnosis were then correlated with the performance of the three prediction methods on the four data sets to determine how to choose the appropriate prediction method. The results revealed that SVR was more suitable for nonlinear data sets, whereas SARIMA and k-NN were more appropriate for linear data sets. The data diagnosis results were also used to devise a selection process for the parameters of the prediction models, such as the length of the training data set for SARIMA and SVR, the average number of nearest neighbors for k-NN, and the input vector length for k-NN and SVR.
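
To make the two k-NN parameters named above concrete (the number of nearest neighbors and the input vector length), the following is a minimal sketch of one common form of k-NN time-series forecasting. It is not taken from the study: the function name knn_forecast, the Euclidean distance, and the unweighted average of the neighbors' successors are all assumptions chosen for illustration.

```python
import math


def knn_forecast(series, input_len, k):
    """Forecast the next value of `series` (hypothetical helper).

    The last `input_len` observations form the query vector. Every earlier
    window of the same length is compared with it, and the values that
    followed the k closest windows are averaged.
    """
    if k < 1 or input_len < 1:
        raise ValueError("k and input_len must be positive")
    n = len(series)
    n_candidates = n - input_len  # windows that have a known successor
    if n_candidates < k:
        raise ValueError("series too short for the requested k and input_len")

    query = series[n - input_len:]
    scored = []
    for start in range(n_candidates):
        window = series[start:start + input_len]
        # Assumption: plain Euclidean distance between input vectors.
        dist = math.dist(window, query)
        scored.append((dist, series[start + input_len]))

    # Keep the k most similar historical windows.
    scored.sort(key=lambda pair: pair[0])
    neighbors = scored[:k]
    # Assumption: unweighted mean of the values that followed each neighbor.
    return sum(value for _, value in neighbors) / k


# Usage on a toy periodic series (not real traffic data).
volumes = [1, 2, 3, 4] * 5
print(knn_forecast(volumes, input_len=3, k=2))
```

In this sketch, a longer input_len makes the match more specific to the recent pattern, while a larger k smooths the forecast over more historical situations; the abstract indicates that the study selects such values from the data diagnosis results rather than fixing them in advance.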
