Research on Behavioral Decision at an Unsignalized Roundabout for Automatic Driving Based on Proximal Policy Optimization Algorithm

Jingpeng Gan, Yuansheng Liu, Jiancheng Zhang

doi:10.3390/app14072889


Open Access

https://doi.org/10.3390/app14072889


Journal: Applied Sciences
Publication Date: Mar 29, 2024
License type: CC BY 4.0

Affiliation: Beijing Union University

Abstract

Unsignalized roundabouts have a significant impact on traffic flow and vehicle safety. To help autonomous vehicles pass through roundabouts efficiently, safely, and stably when autonomous-vehicle penetration is low, we apply the proximal policy optimization (PPO) algorithm to improve decision-making behavior. We develop an optimization-based behavioral choice model for autonomous vehicles that combines gap acceptance theory with deep reinforcement learning using the PPO algorithm. We also employ a CoordConv network to build an aerial view for gathering spatial perception information. In addition, a dynamic multi-objective reward mechanism is introduced to maximize the PPO algorithm's reward pool function while quantifying environmental factors. Simulation experiments show that, compared to the PPO+CCMR algorithm, our optimized PPO algorithm significantly improves training efficiency, raising the reward value function by 2.85%, 7.17%, and 19.58% in scenarios with 20, 100, and 200 social vehicles, respectively. The effectiveness of simulation training also increases by 11.1%, 13.8%, and 7.4%, and crossing time decreases by 2.37%, 2.62%, and 13.96%. The optimized PPO algorithm also improves path selection during simulation training: over time, autonomous vehicles tend to drive in the inner ring, although the influence of social vehicles on path selection diminishes as their number increases. The safety of autonomous vehicles remains largely unaffected by the optimized PPO algorithm.
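
For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes. It illustrates the generic algorithm only; the abstract does not describe the optimized variant, the CoordConv perception input, or the reward mechanism in enough detail to reproduce them. The function name, argument names, and the default clipping value of 0.2 are assumptions chosen for illustration.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (illustrative sketch).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    clip_eps is a hypothetical default, not a value taken from the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and old policy for this action.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Example: the new policy doubles the probability of one action.
value = ppo_clipped_objective([math.log(2.0)], [0.0], [1.0])
```

With a positive advantage, the gain from raising an action's probability is capped once the ratio exceeds 1 + clip_eps, which keeps each policy update close to the policy that collected the data.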


