Accurate forecasting of bus travel time and its uncertainty is critical to the service quality and operation of transit systems: it helps passengers make informed decisions on departure time, route choice, and even transport mode choice, and it also supports transit operators in tasks such as crew/vehicle scheduling and timetabling. However, most existing approaches to bus travel time forecasting are based on deterministic models that provide only point estimates. To address this gap, we develop in this paper a Bayesian probabilistic model for forecasting bus travel time and estimated time of arrival (ETA). To characterize the strong dependencies/interactions between consecutive buses, we concatenate the link travel time vectors and the headway vector from a pair of adjacent buses into a new augmented variable and model it with a mixture of constrained multivariate Gaussian distributions. This approach can naturally capture the interactions between adjacent buses (e.g., correlated speed and smooth variation of headway), handle missing values in data, and depict the multimodality in bus travel time distributions. Next, we assume that different periods of the day share the same set of Gaussian components, and we use time-varying mixing coefficients to characterize the systematic temporal variations in bus operation. For model inference, we develop an efficient Markov chain Monte Carlo (MCMC) algorithm to obtain the posterior distributions of model parameters and make probabilistic forecasts. We test the proposed model using data from two bus lines in Guangzhou, China. Results show that our approach significantly outperforms baseline models that overlook bus-to-bus interactions, in terms of both predictive means and predictive distributions.
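To make the forecasting idea concrete, the following is a minimal sketch of how a Gaussian mixture over an augmented vector can be conditioned on its observed entries to forecast the unobserved ones. It is not the paper's implementation: it uses fixed parameters rather than MCMC posterior samples, ignores the constraints on the Gaussian components, and every name and number in it (`conditional_mixture`, the three-entry toy vector, the means and covariances) is a hypothetical choice for illustration. Here `weights` stands in for the mixing coefficients of the current time period.

```python
import numpy as np

def log_gauss(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + quad)

def conditional_mixture(weights, means, covs, obs_idx, x_obs):
    """Condition a Gaussian mixture on the observed coordinates.

    Returns the updated component weights together with the per-component
    conditional means and covariances of the unobserved coordinates.
    """
    dim = means.shape[1]
    mis_idx = np.setdiff1d(np.arange(dim), obs_idx)
    log_w, cond_means, cond_covs = [], [], []
    for w, mu, cov in zip(weights, means, covs):
        s_oo = cov[np.ix_(obs_idx, obs_idx)]   # observed-observed block
        s_mo = cov[np.ix_(mis_idx, obs_idx)]   # missing-observed block
        s_mm = cov[np.ix_(mis_idx, mis_idx)]   # missing-missing block
        gain = s_mo @ np.linalg.inv(s_oo)
        cond_means.append(mu[mis_idx] + gain @ (x_obs - mu[obs_idx]))
        cond_covs.append(s_mm - gain @ s_mo.T)
        # Reweight each component by how well it explains the observations.
        log_w.append(np.log(w) + log_gauss(x_obs, mu[obs_idx], s_oo))
    log_w = np.array(log_w)
    post_w = np.exp(log_w - log_w.max())       # subtract max for stability
    post_w /= post_w.sum()
    return post_w, np.array(cond_means), np.array(cond_covs)

# Hypothetical toy vector: [leading-bus link time, following-bus link time, headway],
# all in seconds. The two components mimic a "fast" and a "slow" traffic regime.
weights = np.array([0.5, 0.5])
means = np.array([[60.0, 60.0, 300.0], [120.0, 120.0, 300.0]])
cov = np.array([[100.0, 50.0, 0.0],
                [50.0, 100.0, 0.0],
                [0.0, 0.0, 400.0]])
covs = np.stack([cov, cov])

# Observe the leading bus's link time and the headway; forecast the following bus.
obs_idx = np.array([0, 2])
x_obs = np.array([60.0, 300.0])
post_w, cond_means, cond_covs = conditional_mixture(weights, means, covs, obs_idx, x_obs)
forecast_mean = post_w @ cond_means
```

The forecast is itself a mixture: `post_w` gives the updated component weights, and `cond_means` and `cond_covs` describe each component's conditional Gaussian, so the predictive distribution can remain multimodal. In a fully Bayesian treatment, this computation would be repeated for each posterior sample of the parameters and the results averaged.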
Besides forecasting, the parameters of the proposed model contain rich information for understanding/improving the bus service, for example, analyzing link travel time and headway correlation using covariance matrices and understanding time-varying patterns of bus fleet operation from the mixing coefficients.
Funding: This research is supported in part by the Fonds de Recherche du Quebec-Societe et Culture (FRQSC) under the NSFC-FRQSC Research Program on Smart Cities and Big Data, the Canadian Statistical Sciences Institute (CANSSI) Collaborative Research Teams grants, and the Natural Sciences and Engineering Research Council (NSERC) of Canada. X. Chen acknowledges funding support from the China Scholarship Council (CSC).
Supplemental Material: The e-companion is available at https://doi.org/10.1287/trsc.2022.0214.