Abstract
Deep reinforcement learning algorithms have been widely applied to the energy management of hybrid energy storage systems. However, commonly used algorithms such as DQN and DDPG suffer from discrete action spaces and consistently overestimated Q-values. To address these issues, this study proposes a novel energy management strategy for battery-ultracapacitor electric vehicles based on the twin delayed deep deterministic policy gradient (TD3) algorithm. In addition, a driving condition recognition method is integrated into the energy management framework to reduce the training time of the TD3 agent. The implementation proceeds in four steps. First, dynamic experiments are performed to establish high-precision models of the battery and the ultracapacitor. Second, a learning vector quantization neural network is applied to classify driving conditions as urban, suburban, or highway. Third, three parallel TD3 agents are trained, one for each of the urban, suburban, and highway conditions. Finally, the proposed strategy is evaluated under standard driving cycles. The simulation results indicate that, compared with the TD3-based strategy, the proposed strategy improves economy by 1 % and reduces training time by 34 %, and that the economic gap to the dynamic programming-based energy management strategy narrows to 3 %.
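To illustrate how TD3 counters the Q-value overestimation mentioned above, the following minimal sketch shows the two ingredients of its critic target as defined in the general TD3 algorithm: target policy smoothing and the clipped double-Q target. It is not the authors' implementation. The function names, the scalar action (which might represent, for example, a power-split ratio), the action bounds, and the hyperparameter values are all illustrative assumptions.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def smoothed_target_action(mu_next, rng, sigma=0.2, noise_clip=0.5,
                           a_low=-1.0, a_high=1.0):
    """Target policy smoothing: add clipped Gaussian noise to the
    target actor's action, then clamp to the valid action range.

    mu_next is the (hypothetical) target actor output for the next state.
    sigma, noise_clip, a_low, and a_high are assumed values.
    """
    noise = clip(rng.gauss(0.0, sigma), -noise_clip, noise_clip)
    return clip(mu_next + noise, a_low, a_high)


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two
    target-critic estimates, which limits overestimation bias.

    q1_next and q2_next are the two target critics' values at the
    next state and smoothed next action. No bootstrapping if done.
    """
    q_min = min(q1_next, q2_next)
    return reward + gamma * (0.0 if done else 1.0) * q_min


if __name__ == "__main__":
    rng = random.Random(0)
    a_next = smoothed_target_action(0.9, rng)
    y = td3_target(reward=1.0, done=False, q1_next=10.0, q2_next=8.0, gamma=0.5)
    print(a_next, y)
```

Both critics are regressed toward the same target `y`. In TD3 the actor and the target networks are additionally updated less often than the critics (the "delayed" part of the name), which this sketch omits.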