Abstract

Train trajectory optimization (TTO) is an effective way to reduce energy consumption in rail transit. Reinforcement learning (RL), a powerful optimization method, has been used to solve TTO problems. Although traditional RL algorithms use penalty functions to restrict the random exploration behavior of agents, they cannot fully guarantee the safety of either the training process or the resulting trajectories. This paper proposes a proximal policy optimization-based safe reinforcement learning framework (S-PPO) for train trajectory optimization, comprising a safe action rechoosing mechanism (SARM) and a relaxed dynamic reward mechanism (RDRM) that combines a relaxed sparse reward with a dynamic dense reward. SARM guarantees that the new states generated by the agent consistently satisfy the environment's safety constraints, thereby improving sampling efficiency and facilitating algorithm convergence. RDRM relaxes the arrival-time constraint so that agents obtain successful samples more easily, and it also strikes a better balance between exploration and exploitation. Experimental results show that S-PPO significantly improves performance, yields better train operation trajectories than soft-constraint methods, and converges more smoothly. Finally, S-PPO is shown to adapt well to tracks with various speed limits.
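
To make the two mechanisms named above concrete, the sketch below shows one plausible form they could take. It is only an illustration under assumed interfaces: the callables sample_action, step_model, and is_safe, the full-braking fallback, the arrival-time tolerance, and the reward weights are hypothetical and not taken from the paper.

```python
def rechoose_safe_action(sample_action, step_model, is_safe, state, max_tries=50):
    """SARM-style loop (assumed form): resample the policy's action until the
    predicted next state satisfies the safety constraints (e.g. speed limits)."""
    for _ in range(max_tries):
        action = sample_action(state)              # e.g. a traction/brake command
        if is_safe(step_model(state, action)):     # one-step dynamics + safety check
            return action
    return -1.0  # assumed fallback: maximum braking


def relaxed_dynamic_reward(energy_step, arrival_error, step, horizon,
                           time_tolerance=10.0, w_dense=1.0, w_sparse=5.0):
    """RDRM-style reward (assumed form): a dense energy-shaping term whose weight
    grows as the episode progresses, plus a terminal sparse bonus that only
    requires the arrival-time error to fall inside a relaxed tolerance."""
    dense = -w_dense * energy_step                 # dense penalty on energy used this step
    sparse = 0.0
    if step == horizon:                            # terminal step of the episode
        sparse = w_sparse if abs(arrival_error) <= time_tolerance else -w_sparse
    return (step / horizon) * dense + sparse       # dynamic weighting of the dense term
```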
