Abstract

Routing technologies have long been available in many automobiles and smartphones, but the nearly random nature of traffic on road networks has continually encouraged efforts to improve the reliability of navigation systems. Given this uncertainty, an adaptive dynamic route selection model based on reinforcement learning is proposed. In the proposed method, a Markov decision process (MDP) is used to train simulated agents in a network so that they can make independent decisions under random conditions and thereby determine the set of routes with the shortest travel time. The aim of the research was to integrate the MDP with a multinomial logit model (a widely used stochastic discrete-choice model) to improve stochastic shortest-path finding by computing the probability of selecting an arc from several interconnected arcs based on observations made at the arc location. The proposed model was tested with real data from part of the road network in Isfahan, Iran, and the results demonstrated its good performance under 100 randomly applied stochastic scenarios.
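
The abstract only names the ingredients of the method (an MDP value backup over stochastic arc travel times, combined with a multinomial logit rule for choosing among outgoing arcs), so the following is a minimal illustrative sketch, not the authors' implementation. The toy network, the travel-time noise model, and the scale parameter `theta` are assumptions introduced purely for illustration.

```python
# Illustrative sketch only: an MDP-style value backup over stochastic arc
# travel times, combined with a multinomial logit rule for selecting among
# the outgoing arcs at a node. Network, noise model, and theta are assumed.
import math
import random

# Toy network: node -> list of (next_node, mean_travel_time); "D" is the destination.
network = {
    "A": [("B", 4.0), ("C", 6.0)],
    "B": [("D", 5.0), ("C", 1.5)],
    "C": [("D", 3.0)],
    "D": [],
}

def sample_travel_time(mean):
    """Stochastic arc cost: the mean travel time perturbed by random traffic noise."""
    return max(0.1, random.gauss(mean, 0.2 * mean))

def value_iteration(network, dest="D", sweeps=200):
    """Estimate the expected travel time to the destination with repeated
    sampled Bellman backups (a simple MDP shortest-path recursion)."""
    V = {n: 0.0 if n == dest else float("inf") for n in network}
    for _ in range(sweeps):
        for node, arcs in network.items():
            if node == dest or not arcs:
                continue
            costs = [sample_travel_time(m) + V[nxt]
                     for nxt, m in arcs if V[nxt] < float("inf")]
            if costs:
                best = min(costs)
                # Smooth toward the sampled backup to average out the noise.
                V[node] = best if V[node] == float("inf") else 0.9 * V[node] + 0.1 * best
    return V

def logit_arc_probabilities(node, V, theta=1.0):
    """Multinomial logit choice over the outgoing arcs: arcs with lower
    expected time to the destination receive higher selection probability."""
    arcs = network[node]
    utils = [-theta * (m + V[nxt]) for nxt, m in arcs]
    mx = max(utils)
    exps = [math.exp(u - mx) for u in utils]
    total = sum(exps)
    return {nxt: e / total for (nxt, _), e in zip(arcs, exps)}

if __name__ == "__main__":
    V = value_iteration(network)
    print("Expected time to destination:", {k: round(v, 2) for k, v in V.items()})
    print("Arc-choice probabilities at A:", logit_arc_probabilities("A", V))
```

In this sketch the logit probabilities play the role described in the abstract: rather than deterministically picking the single shortest arc, the agent selects among interconnected arcs with probabilities driven by their expected remaining travel time under the stochastic network conditions.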
