Abstract

In this paper, we study how to learn a harmonious lane-changing strategy for autonomous vehicles using deep reinforcement learning (DRL), without Vehicle-to-Everything (V2X) communication support. The basic framework can be viewed as multi-agent reinforcement learning in which the agents exchange their strategies after each round of learning to reach a zero-sum game state. Unlike cooperative driving, harmonious driving relies only on each vehicle's own limited sensing results to balance overall and individual efficiency. Specifically, we propose a reward that combines individual efficiency with overall efficiency to achieve harmony, instead of emphasizing only individual interests as a competitive strategy does. Test results show that the competitive strategy often leads to selfish lane-change behaviors, anarchy of the crowd, and thus a degradation of traffic efficiency. In contrast, the proposed harmonious strategy achieves higher traffic efficiency than the competitive strategy in both free-flow and congested traffic. This finding indicates that reward design deserves careful attention when building reinforcement-learning-based AI robots (e.g., automated vehicles) whose utilities are not strictly aligned.
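The contrast between a competitive and a harmonious reward can be illustrated with a minimal sketch. The abstract does not give the reward formula, so everything below is an assumption made for illustration: the function names, the use of normalized speed as the efficiency measure, and the linear blending parameter `weight` are hypothetical and are not taken from the paper.

```python
from statistics import mean


def competitive_reward(ego_speed, max_speed):
    """Individual efficiency only: the ego vehicle's normalized speed.

    Assumption: efficiency is measured as speed / max_speed, capped at 1.
    """
    return min(ego_speed / max_speed, 1.0)


def harmonious_reward(ego_speed, sensed_speeds, max_speed, weight=0.5):
    """Blend individual efficiency with a locally sensed overall efficiency.

    sensed_speeds holds the speeds of vehicles within the ego vehicle's own
    sensing range (no V2X data is assumed). `weight` is a hypothetical
    trade-off parameter: 0 reduces to the competitive reward, 1 uses only
    the overall term.
    """
    individual = competitive_reward(ego_speed, max_speed)
    if sensed_speeds:
        # Overall efficiency: mean normalized speed of ego plus sensed vehicles.
        overall = mean(
            min(v / max_speed, 1.0) for v in [ego_speed, *sensed_speeds]
        )
    else:
        # Nothing sensed: fall back to the individual term.
        overall = individual
    return (1.0 - weight) * individual + weight * overall
```

Under these assumptions, a vehicle travelling at full speed while the neighbours it senses are slowed down receives a lower harmonious reward than competitive reward, which is the kind of signal that would discourage selfish lane changes.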
