An Adaptive Model-Free Control Method for Metro Train Based on Deep Reinforcement Learning

Wenzhu Lai, Dewang Chen, Benzun Huang, Yunhu Huang

doi:10.1007/978-3-031-20738-9_31

Abstract

Current metro train control systems have achieved automatic operation, but their degree of intelligence still needs to be enhanced. To improve the intelligence of train driving, this paper draws on the successful application of deep reinforcement learning in games and adopts the proximal policy optimization (PPO) algorithm to study the intelligent train operation (ITO) of metro trains. We propose a model-free adaptive control (MFAC) method for train speed profile tracking, named intelligent train operation based on PPO (ITOP), and design the reinforcement learning policies, actions, and rewards to ensure speed profile tracking accuracy, passenger comfort, and stopping accuracy. Simulation experiments are conducted using real railroad data from the Yizhuang Line of Beijing Metro (YLBS). The results show that the tracking curve generated by ITOP closely coincides with the target curve, achieves good stopping accuracy and comfort, and responds well to changes of the target curve during operation. This provides a new solution for the intelligent control of trains.

Keywords: Intelligent train operation, Model free adaptive control, Deep reinforcement learning

Full Text