Accurately monitoring fluctuations in passenger demand is crucial for the efficient operation of subway systems and for informed decision-making. This study presents a detailed time series analysis of the Toronto subway system, using Wi-Fi connection data from devices as a predictor of passenger volume. Several models were tested for short-term forecasting: Linear Regression, Exponential Smoothing, ARIMA, Random Forest, N-BEATS, and T-GCN. An end-to-end modeling implementation was carried out, and the performance of each model was evaluated. The primary objective was to assess the effectiveness of short-term prediction models for univariate time series at the system level and to discuss deployment challenges. While conventional time series models are fast to implement and interpretable, they require a more in-depth data exploration phase for validation, which makes them difficult to scale to the system level. They are also harder to maintain, because the exploratory analysis must be repeated whenever a model degrades over time. Prediction difficulty varied across subway stations, indicating the need for more thorough calibration or a hybrid approach, especially for transfer stations. Although each model has different uses and strengths in our scenario, Random Forest and Exponential Smoothing emerged as the best performers and could be satisfactory options for robust demand forecasting at the system level.