Abstract

Stop-and-go traffic poses significant challenges to the efficiency and safety of traffic operations. In this study, a cooperative longitudinal control strategy based on Soft Actor-Critic (SAC) reinforcement learning (RL) is proposed to address this issue. The reward function is designed to account for vehicle cooperation and to achieve three main objectives: safety, efficiency, and oscillation dampening. A global performance metric for oscillation dampening is proposed to evaluate the developed RL models and the baseline models. Depending on the number of preceding vehicles that can share maneuver information, two models, RL-1 and RL-2, are proposed and compared with human-driven (HD) vehicles and an adaptive cruise control (ACC) model using HighD and simulated data. With information from additional preceding vehicles, RL-2 dampens shockwaves more efficiently. Specifically, RL-1 and RL-2 decrease traffic oscillation by 15%-36% and 15%-42%, respectively, while HD amplifies the oscillation by 14%-37%. The ACC model also dampens shockwaves but is less effective than RL-1 and RL-2. The two RL control methods are further evaluated using data collected with a commercial Model X vehicle. Compared with the commercial Model X ACC in some controlled settings, the proposed RL methods dampen stop-and-go waves better, producing smaller oscillation growth, less overshooting, and a smaller average rate of change in acceleration/deceleration, which suggests that they generalize well to a new but similar environment. Finally, the RL methods are evaluated in a platoon of vehicles under different RL penetration rates. The results show that they consistently outperform HD and ACC in dampening shockwaves.
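
To make the idea of "amplifying" versus "dampening" an oscillation concrete, the following is a minimal Python sketch of one possible way to quantify it. It is not the global performance metric proposed in the study, whose definition is not given in this abstract. The function name `oscillation_change`, the use of the speed standard deviation as the oscillation measure, and the speed traces are all assumptions made purely for illustration.

```python
from statistics import pstdev


def oscillation_change(leader_speeds, follower_speeds):
    """Relative change in speed variability from leader to follower.

    Hypothetical illustration only (not the paper's metric).
    Negative values mean the follower dampens the oscillation;
    positive values mean the follower amplifies it.
    """
    lead_sd = pstdev(leader_speeds)
    if lead_sd == 0:
        raise ValueError("leader speed is constant; ratio is undefined")
    return pstdev(follower_speeds) / lead_sd - 1.0


# Made-up stop-and-go speed traces in m/s (not data from the study).
leader = [20.0, 10.0, 20.0, 10.0]
damped_follower = [17.5, 12.5, 17.5, 12.5]
amplified_follower = [21.0, 9.0, 21.0, 9.0]

print(oscillation_change(leader, damped_follower))
print(oscillation_change(leader, amplified_follower))
```

Under this assumed measure, the first follower halves the leader's speed variability and so yields a negative value, while the second follower yields a positive value, the same sign convention as the "decrease" and "amplify" percentages quoted above.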
