Abstract

We propose two classes of algorithms for achieving user equilibrium in simulation-based dynamic traffic assignment (DTA), with special attention to the interaction between travel information and route choice behavior. Each driver is assumed to make a day-to-day route choice repeatedly and to observe payoffs perturbed by noise of unknown characteristics. The driver adaptively changes his/her action in accordance with the payoffs, which are initially unknown and must be estimated over time from these noisy observations. To solve this problem, we develop a multi-agent version of Q-learning that estimates the payoff functions using novel forms of the ε-greedy learning policy. We apply this Q-learning scheme to simulation-based DTA, in which the traffic flows and route travel times of the network are generated by a cellular-automaton-based microscopic traffic simulator. Finally, we provide simulation examples showing the convergence of our algorithms to a Nash equilibrium and the effectiveness of best-route provision services.
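The paper's specific learning rules and simulator are not reproduced in this abstract, but the overall scheme it describes can be illustrated with a minimal sketch: a standard stateless ε-greedy Q-learning agent performing day-to-day route choice, with a simple congestion function standing in for the cellular-automaton microscopic simulator. All names and parameters below (`RouteChoiceAgent`, `alpha`, `epsilon`, `travel_time`, the noise model) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class RouteChoiceAgent:
    """One driver estimating route payoffs with stateless Q-learning."""

    def __init__(self, n_routes, alpha=0.1, epsilon=0.1, rng=None):
        self.q = np.zeros(n_routes)          # estimated payoff per route
        self.alpha = alpha                   # learning rate
        self.epsilon = epsilon               # exploration probability
        self.rng = rng or np.random.default_rng()

    def choose_route(self):
        # epsilon-greedy: explore uniformly with prob. epsilon, else exploit
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(len(self.q)))
        return int(np.argmax(self.q))

    def update(self, route, payoff):
        # move the estimate toward the noisy observed payoff
        self.q[route] += self.alpha * (payoff - self.q[route])


def travel_time(route_loads, capacity=50.0, free_flow=10.0):
    # hypothetical congestion function; in the paper this role is played
    # by a cellular-automaton microscopic traffic simulator
    return free_flow * (1.0 + (route_loads / capacity) ** 2)


# day-to-day route choice loop
n_agents, n_routes, n_days = 200, 3, 500
agents = [RouteChoiceAgent(n_routes) for _ in range(n_agents)]
for day in range(n_days):
    choices = np.array([a.choose_route() for a in agents])
    loads = np.bincount(choices, minlength=n_routes).astype(float)
    times = travel_time(loads)
    noise = np.random.normal(0.0, 0.5, n_agents)  # unknown observation noise
    for agent, route, eps in zip(agents, choices, noise):
        agent.update(route, -(times[route] + eps))  # payoff = negative travel time
```

Under this kind of day-to-day dynamic, the flow pattern settles when no driver can improve his/her estimated payoff by switching routes, which is the user-equilibrium (Nash) condition the paper's algorithms are designed to reach.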
