Abstract
We propose a new multiobjective control algorithm based on reinforcement learning for urban traffic signal control, named multi-RL. A multiagent structure is used to describe the traffic system, and a vehicular ad hoc network is used for data exchange among agents. A reinforcement learning algorithm is applied to predict the overall value of the optimization objective given the vehicles' states. The policy that minimizes the cumulative value of the optimization objective is regarded as the optimal one. To make the method adaptive to various traffic conditions, we also introduce a multiobjective control scheme in which the optimization objective is selected adaptively according to real-time traffic states. The optimization objectives include the number of vehicle stops, the average waiting time, and the maximum queue length of the next intersection. In addition, our model accommodates priority control for buses and emergency vehicles. The simulation results indicate that our algorithm can perform more efficiently than traditional traffic light control methods.
Highlights
Increasing traffic congestion over the road networks makes the development of more intelligent and efficient traffic control systems an urgent and important requirement.
This paper is organized as follows: in Section 2, we introduce how to model the road network with an agent-based structure; Section 3 describes how to exchange traffic data using the ad hoc network; in Section 4, a multiagent traffic control strategy using reinforcement learning is proposed; in Section 5, the proposed method is applied to a road network with 7 intersections to demonstrate its effectiveness; in Section 6, we draw the conclusion of this paper.
A multiobjective control algorithm based on reinforcement learning is proposed.
Summary
Increasing traffic congestion over the road networks makes the development of more intelligent and efficient traffic control systems an urgent and important requirement. Thorpe studied reinforcement learning for traffic light control in 1997. He used a neural network to predict the waiting time for all cars standing at the intersection and selected the best control policy using the SARSA algorithm [10]. Wiering et al. utilized a car-based value function to solve this problem [13, 14]. Using reinforcement learning, they predicted each car's total expected waiting time until it arrived at its destination, given the possible choices of the related traffic lights, and chose the action that minimized the summed waiting time of all cars in the network. This method effectively reduces the state space and can be applied to large-network control. This paper is organized as follows: in Section 2, we introduce how to model the road network with an agent-based structure; Section 3 describes how to exchange traffic data using the ad hoc network; in Section 4, a multiagent traffic control strategy using reinforcement learning is proposed; in Section 5, the proposed method is applied to a road network with 7 intersections to demonstrate its effectiveness; in Section 6, we draw the conclusion of this paper.
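The car-based action selection described above can be illustrated with a minimal Python sketch. It is not the implementation from [13, 14] or from this paper: the table `q_wait`, the state encoding, the action names, and the function `select_action` are all hypothetical, and the predicted waiting times are made-up numbers standing in for values that a reinforcement learning procedure would have learned. The sketch only shows the selection step, that is, summing each car's predicted waiting time per candidate light setting and picking the minimum.

```python
from typing import Dict, Hashable, Iterable, Tuple

# Hypothetical types: a car state is any hashable description of a car
# (here a (lane, queue position) pair); an action is a light setting.
CarState = Hashable
Action = str


def select_action(
    car_states: Iterable[CarState],
    actions: Iterable[Action],
    q_wait: Dict[Tuple[CarState, Action], float],
    default_wait: float = 0.0,
) -> Action:
    """Return the action with the smallest summed predicted waiting time."""
    cars = list(car_states)  # materialize so the cars can be reused per action

    def total_wait(action: Action) -> float:
        # Sum the predicted waiting time of every car under this action;
        # unseen (state, action) pairs fall back to default_wait.
        return sum(q_wait.get((car, action), default_wait) for car in cars)

    return min(actions, key=total_wait)


# Illustrative values only (assumed, not taken from any experiment).
q_wait = {
    (("north", 1), "NS_green"): 2.0,
    (("north", 1), "EW_green"): 10.0,
    (("north", 2), "NS_green"): 4.0,
    (("north", 2), "EW_green"): 12.0,
    (("east", 1), "NS_green"): 8.0,
    (("east", 1), "EW_green"): 3.0,
}
cars = [("north", 1), ("north", 2), ("east", 1)]
best = select_action(cars, ["NS_green", "EW_green"], q_wait)
print(best)
```

With these assumed values, the summed waiting time is 14.0 for `NS_green` and 25.0 for `EW_green`, so the sketch selects `NS_green`. Because the value is stored per car rather than per joint intersection state, the table grows with the number of distinct car states instead of with their combinations, which is the state-space reduction the text refers to.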