Abstract

Reinforcement learning has potential in the area of intelligent transportation because of its generality and real-time capability. The Q-learning algorithm, an early reinforcement learning algorithm, has its own merits for solving the train timetable rescheduling (TTR) problem, but it suffers from two shortcomings: a dimensional limit on actions and a slow convergence rate. In this paper, a deep deterministic policy gradient (DDPG) algorithm is applied to solve the energy-aimed train timetable rescheduling (ETTR) problem. As a reinforcement learning algorithm, it fulfills the real-time requirement of the ETTR problem and adapts to random disturbances. Unlike Q-learning, DDPG operates over continuous state and action spaces. After sufficient training, the DDPG-based learning agent takes proper actions by continuously adjusting the cruising speed and the dwelling time of each train in a metro network when random disturbances happen. Although training requires iterating over thousands of episodes, the policy decision in each testing episode takes very little time. Models of the metro network, based on a real case of Shanghai Metro Line 1, are established as the training and testing environment. To validate the energy-saving effect and the real-time performance of the proposed algorithm, four experiments are designed and conducted. Compared with a no-action strategy, the results show that the proposed algorithm performs in real time and saves a significant percentage of energy under random disturbances.
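The abstract describes the agent's continuous action as an adjustment to each train's cruising speed and dwelling time. The sketch below illustrates how such a bounded continuous action could be mapped onto timetable variables; the bound values and the function name are assumptions for illustration, not taken from the paper.

```python
import numpy as np

# Hypothetical adjustment bounds (illustrative only, not from the paper).
SPEED_ADJ_MAX = 2.0   # max cruising-speed adjustment, m/s (assumed)
DWELL_ADJ_MAX = 10.0  # max dwelling-time adjustment, s (assumed)

def apply_action(cruise_speed, dwell_time, action):
    """Map a raw actor output in [-1, 1]^2 to bounded timetable changes.

    This mirrors the common DDPG practice of emitting actions in a
    normalized range and scaling them to physical units.
    """
    a = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
    new_speed = cruise_speed + a[0] * SPEED_ADJ_MAX
    new_dwell = max(0.0, dwell_time + a[1] * DWELL_ADJ_MAX)  # dwell stays non-negative
    return float(new_speed), float(new_dwell)
```

Because the action space is continuous, the agent can make arbitrarily fine adjustments, which is exactly the advantage over Q-learning's discretized actions noted in the abstract.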

Highlights

  • Nowadays, artificial intelligence (AI) has successfully been used for understanding human speech [1,2], competing at a high level in strategic game systems, driving vehicles autonomously [6,7], and interpreting complex data [8,9]

  • Reinforcement learning (RL) [10,11], which is a vital branch of AI, has potential in the area of intelligent transportation

  • There are two advantages of RL: First, due to its generality, agents can effectively learn across many disciplines in a complex environment such as a metro network [12,13,14]; second, an agent that has fully explored the environment can make proper decisions in real time, which means that RL can be used in optimization problems with real-time requirements

Introduction

Artificial intelligence (AI) has successfully been used for understanding human speech [1,2], competing at a high level in strategic game systems (such as Chess [3] and Go [4,5]), driving vehicles autonomously [6,7], and interpreting complex data [8,9]. Prior work has applied RL-based algorithms to calculate optimal decisions that minimize a combined cost of total time delay and energy consumption. Both studies are based on the Q-learning algorithm [21,22], which belongs to RL. In this paper, DDPG is applied to solve the ETTR problem. DDPG is a model-free, off-policy actor-critic algorithm that uses deep function approximators to learn policies in high-dimensional, continuous action spaces [23]. It has been successfully applied in fields such as robotic control [27] and traffic light timing optimization [28].
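The actor-critic structure mentioned above pairs a deterministic policy network mu(s), which outputs a continuous action, with a Q-network Q(s, a), which scores that action and provides the gradient for the policy update. A minimal numpy forward-pass sketch follows; the layer sizes, state/action dimensions, and weight initialization are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions for illustration (not from the paper).
STATE_DIM, ACTION_DIM, HIDDEN = 4, 2, 16

# Actor weights: a small MLP whose tanh output keeps actions in [-1, 1].
W1 = rng.normal(scale=0.1, size=(STATE_DIM, HIDDEN))
W2 = rng.normal(scale=0.1, size=(HIDDEN, ACTION_DIM))

def actor(state):
    """Deterministic policy mu(s): maps a state to a continuous action."""
    h = np.tanh(state @ W1)
    return np.tanh(h @ W2)

# Critic weights: Q(s, a) takes the state and action concatenated.
V1 = rng.normal(scale=0.1, size=(STATE_DIM + ACTION_DIM, HIDDEN))
v2 = rng.normal(scale=0.1, size=HIDDEN)

def critic(state, action):
    """Action-value Q(s, a): the scalar the actor update ascends."""
    h = np.tanh(np.concatenate([state, action]) @ V1)
    return float(h @ v2)

s = np.zeros(STATE_DIM)
a = actor(s)       # continuous action in [-1, 1]^ACTION_DIM
q = critic(s, a)   # scalar value; its gradient w.r.t. a trains the actor
```

In full DDPG, both networks also have slowly updated target copies and are trained from a replay buffer, which is what makes the algorithm off-policy.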

Principles of Deep Deterministic Policy Gradient
Model of Train Traffic
Model of Energy Consumption
Model of Train Movement
Relation
Environment and Agent
Action
Rewards
Experimental
Conclusions