A twin delayed deep deterministic policy gradient-based energy management strategy for a battery-ultracapacitor electric vehicle considering driving condition recognition with learning vector quantization neural network

Rui Liu, Chun Wang, Aihua Tang, Yongzhi Zhang, Quanqing Yu

doi:10.1016/j.est.2023.108147

Abstract

Deep reinforcement learning algorithms have been widely applied to the energy management of hybrid energy storage systems. However, algorithms such as DQN and DDPG suffer from discontinuous action spaces and consistently overestimated Q values. To address these issues, this study proposes a novel energy management strategy based on the twin delayed deep deterministic policy gradient (TD3) algorithm for battery-ultracapacitor electric vehicles. In addition, a driving condition recognition method is integrated into the energy management framework to reduce the training time of the TD3 agent. The implementation proceeds in four steps. First, dynamic experiments are performed to establish high-precision models of the battery and ultracapacitor. Second, learning vector quantization neural networks are applied to classify driving conditions as urban, suburban, or highway. Third, three parallel TD3 agents are trained, one each for urban, suburban, and highway conditions. Finally, the proposed strategy is evaluated under standard driving cycles. The simulation results indicate that, compared with the TD3-based strategy, the proposed strategy improves the economy by 1 % and reduces the training time by 34 %, and that the economic gap with the dynamic programming-based energy management strategy is narrowed to 3 %.
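
The recognition-then-dispatch idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the basic LVQ1 update rule with one prototype per class, and the two features (normalized mean speed, idle-time fraction), the toy samples, the learning rate, and the string placeholders standing in for trained TD3 agents are all assumptions made for illustration.

```python
import math

def nearest(prototypes, x):
    """Index of the prototype closest to x (Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: math.dist(prototypes[i][0], x))

def lvq1_train(prototypes, samples, lr=0.1, epochs=20):
    """LVQ1: move the winning prototype toward a sample of the same
    class and away from a sample of a different class."""
    protos = [(list(w), label) for w, label in prototypes]
    for _ in range(epochs):
        for x, y in samples:
            w, label = protos[nearest(protos, x)]
            sign = 1.0 if label == y else -1.0
            for k in range(len(w)):
                w[k] += sign * lr * (x[k] - w[k])
    return protos

def classify(prototypes, x):
    """Label of the nearest prototype."""
    return prototypes[nearest(prototypes, x)][1]

def select_agent(agents, prototypes, x):
    """Pick the agent registered for the recognized driving condition."""
    return agents[classify(prototypes, x)]

# Toy, made-up feature vectors: (normalized mean speed, idle-time fraction).
samples = [
    ((0.20, 0.30), "urban"),    ((0.25, 0.25), "urban"),
    ((0.50, 0.10), "suburban"), ((0.55, 0.12), "suburban"),
    ((0.90, 0.02), "highway"),  ((0.95, 0.01), "highway"),
]

# One prototype per class, initialized at the first sample of that class.
initial = [((0.20, 0.30), "urban"),
           ((0.50, 0.10), "suburban"),
           ((0.90, 0.02), "highway")]

trained = lvq1_train(initial, samples)

# Placeholders standing in for three separately trained TD3 agents.
agents = {"urban": "td3_urban",
          "suburban": "td3_suburban",
          "highway": "td3_highway"}

print(classify(trained, (0.22, 0.28)))
print(select_agent(agents, trained, (0.92, 0.02)))
```

In this sketch each agent only ever sees data from one class of driving condition, which is one plausible reading of why recognition could shorten training; the abstract itself reports the training-time reduction without detailing the mechanism.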

Full Text