Traffic prediction is essential for transportation planning, resource allocation, congestion management, and improving travel experiences. Data preprocessing is critical in preparing data for machine learning models. This study proposes an approach that optimizes data preprocessing techniques, focusing on flow-based analysis and optimization, to improve machine learning-based traffic prediction models. The approach explores fixed orders of data preprocessing and variable orders selected by a genetic algorithm across five diverse datasets. Model performance is assessed with root mean squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R-squared). The results indicate that the variable order selected by the genetic algorithm achieves the best performance for the ArcGIS Hub, Frementon Bridge Cycle, and PeMS08 datasets, fixed order 1 preprocessing for the Traffic Prediction dataset, and fixed order 2 preprocessing for the XI AN Traffic dataset. These findings highlight the importance of selecting the appropriate preprocessing flow order for each dataset in order to improve traffic prediction accuracy and reliability. The proposed approach advances traffic prediction methodologies, enabling more precise and reliable traffic forecasts for transportation planning and management applications.
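To make the idea of a variable preprocessing order concrete, the following is a minimal sketch of a genetic algorithm that searches over permutations of preprocessing steps. It is not the study's implementation. The three steps (`impute`, `clip`, `smooth`), the toy series, the `MISSING` sentinel, and the fitness function (RMSE between the preprocessed series and a clean reference, standing in for a model's validation error) are all assumptions made for illustration, as are the genetic operators and their parameters.

```python
import math
import random

# Evaluation metrics named in the abstract.
def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

MISSING = -1.0  # hypothetical sentinel for a missing sensor reading

def impute(xs):
    """Forward fill: replace the sentinel with the last valid reading."""
    out, last = [], 0.0
    for x in xs:
        if x != MISSING:
            last = x
        out.append(last)
    return out

def clip(xs, lo=0.0, hi=60.0):
    """Clamp every reading to an assumed plausible range."""
    return [min(max(x, lo), hi) for x in xs]

def smooth(xs):
    """Centered 3-point moving average; edges use the available neighbours."""
    out = []
    for i in range(len(xs)):
        window = xs[max(i - 1, 0):i + 2]
        out.append(sum(window) / len(window))
    return out

STEPS = {"impute": impute, "clip": clip, "smooth": smooth}

# Toy data (assumed): a clean series and a copy with one gap and one outlier.
CLEAN = [20.0, 22.0, 25.0, 30.0, 28.0, 26.0, 24.0, 22.0]
RAW = [20.0, 22.0, MISSING, 30.0, 500.0, 26.0, 24.0, 22.0]

def fitness(order):
    """Stand-in for validation error: lower is better."""
    xs = RAW
    for name in order:
        xs = STEPS[name](xs)
    return rmse(CLEAN, xs)

def order_crossover(a, b, rng):
    """Keep a slice of parent a in place, fill the rest in parent b's order."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    kept = a[i:j]
    rest = [s for s in b if s not in kept]
    return rest[:i] + kept + rest[i:]

def swap_mutation(order, rng):
    """Exchange two randomly chosen positions."""
    i, j = rng.sample(range(len(order)), 2)
    order = list(order)
    order[i], order[j] = order[j], order[i]
    return order

def search_order(steps, fitness, pop_size=6, generations=10,
                 mutation_rate=0.3, seed=0):
    rng = random.Random(seed)
    population = [rng.sample(steps, len(steps)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        next_gen = population[:2]  # elitism: the two best orders survive
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random candidates.
            p1 = min(rng.sample(population, 3), key=fitness)
            p2 = min(rng.sample(population, 3), key=fitness)
            child = order_crossover(p1, p2, rng)
            if rng.random() < mutation_rate:
                child = swap_mutation(child, rng)
            next_gen.append(child)
        population = next_gen
    return min(population, key=fitness)

best = search_order(list(STEPS), fitness)
print("best order:", best, "RMSE:", fitness(best))
```

With only three steps there are six possible orders, so exhaustive search would suffice here; a genetic search becomes worthwhile as the number of steps grows. In a real pipeline the fitness function would train the prediction model on data preprocessed in the candidate order and return its validation RMSE, MAE, or R-squared.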