Abstract

We propose two classes of algorithms for achieving user equilibrium in simulation-based dynamic traffic assignment (DTA), with special attention to the interaction between travel information and route choice behavior. Each driver is assumed to make day-to-day route choices repeatedly and to experience payoffs corrupted by unknown noise. The driver adaptively adjusts his/her action according to these payoffs, which are initially unknown and must be estimated over time from noisy observations. To solve this problem, we develop a multi-agent version of Q-learning that estimates the payoff functions using novel forms of the ε-greedy learning policy. We apply this Q-learning scheme to simulation-based DTA, in which traffic flows and route travel times in the network are generated by a microscopic traffic simulator based on a cellular automaton. Finally, we provide simulation examples that show convergence of our algorithms to a Nash equilibrium and the effectiveness of best-route provision services.
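To make the day-to-day learning loop concrete, the following is a minimal Python sketch of ε-greedy Q-learning for route choice under noisy travel times. It is illustrative only: the route set, congestion function, and parameter values are assumptions, and the `noisy_travel_time` stub stands in for the cellular-automaton traffic simulator used in the paper.

```python
import random

# Minimal sketch: day-to-day route choice with epsilon-greedy Q-learning.
# All constants below are illustrative assumptions, not values from the paper.

N_DRIVERS = 100
ROUTES = [0, 1, 2]          # candidate routes between one origin-destination pair
ALPHA = 0.1                 # learning rate for the payoff estimate
EPSILON = 0.05              # exploration probability
DAYS = 500

# Each driver keeps an estimate of the (negative) travel time of every route.
Q = [[0.0 for _ in ROUTES] for _ in range(N_DRIVERS)]

def noisy_travel_time(route, load):
    """Toy congestion function with observation noise (placeholder for the simulator)."""
    free_flow = [10.0, 12.0, 15.0][route]
    return free_flow + 0.05 * load + random.gauss(0.0, 1.0)

for day in range(DAYS):
    # Each driver picks a route: explore with probability EPSILON, otherwise exploit.
    choices = []
    for q in Q:
        if random.random() < EPSILON:
            choices.append(random.choice(ROUTES))
        else:
            choices.append(max(ROUTES, key=lambda r: q[r]))

    # Route loads determine the experienced (noisy) travel times.
    loads = [choices.count(r) for r in ROUTES]
    for driver, route in enumerate(choices):
        payoff = -noisy_travel_time(route, loads[route])   # payoff = negative travel time
        Q[driver][route] += ALPHA * (payoff - Q[driver][route])
```

Under such a scheme, the estimated payoffs stabilize as route flows settle, which is the kind of convergence toward a Nash (user) equilibrium that the simulation examples in the paper examine.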
