Abstract

This article develops a control strategy for hybrid electric vehicles (HEVs) based on the Q-learning algorithm. When Q-learning is applied to problems with a high-dimensional state space, the agent suffers from the curse of dimensionality during training. A control strategy based on the Deep Q Network (DQN) algorithm is therefore introduced. Because DQN can only output discrete actions, an optimized control strategy based on the Deep Deterministic Policy Gradient (DDPG) algorithm is proposed to achieve continuous action control. Simulation results show that, compared with the Q-learning and DQN algorithms, the DDPG algorithm converges faster and its training process is more robust. In addition, the energy optimization control strategy based on the DDPG algorithm controls the energy of HEVs more effectively.
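
To make the dimensionality problem concrete, here is a minimal Python sketch of tabular Q-learning and of how its table grows with the number of state variables. It is an illustration under stated assumptions, not the article's implementation: the helper names, the 20 bins per state variable, the 11 discrete actions, and the learning parameters are all hypothetical values chosen for the example.

```python
def q_table_size(bins_per_dim, n_state_dims, n_actions):
    """Entries in a Q-table when each state variable is split into equal bins."""
    return (bins_per_dim ** n_state_dims) * n_actions


def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    `q` is a list of rows, one row of action values per discrete state.
    """
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])
    return q


# Hypothetical discretization: 20 bins per state variable, 11 discrete actions.
# The table grows exponentially with the number of state variables.
sizes = {dims: q_table_size(20, dims, 11) for dims in (1, 2, 3, 4)}
# sizes == {1: 220, 2: 4400, 3: 88000, 4: 1760000}

# A single update on a tiny 4-state, 2-action table initialized to zero.
q = [[0.0, 0.0] for _ in range(4)]
q_learning_update(q, state=0, action=1, reward=1.0, next_state=2)
```

Every table entry has to be visited many times before its value is reliable, so exponential growth in the table translates directly into training cost. DQN avoids the table by approximating Q with a neural network but still selects from a finite action set, while DDPG uses an actor network that outputs a continuous action directly.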
