The energy management strategy (EMS) is key to the performance of fuel cell / battery hybrid systems. Reinforcement learning (RL) has recently been introduced into this field and has gradually become a research focus. However, traditional EMSs consider only energy consumption when optimizing operating economy and ignore the cost caused by power source degradation, which leads to poor economy in terms of total cost of ownership (TCO). In addition, most RL algorithms studied to date suffer from value overestimation and restrict battery state of charge (SOC) in an improper way, which also degrades control performance. To address these problems, this paper first establishes a TCO model that includes energy consumption, equivalent energy consumption and power source degradation, and then adopts a Double Q-learning RL algorithm with a state constraint and a variable action space to determine the optimal EMS. Finally, on a hardware-in-the-loop platform, the feasibility, superiority and generalization of the proposed EMS are demonstrated by comparison with optimal dynamic programming, a traditional RL EMS and the equivalent consumption minimization strategy (ECMS) under both training and unknown operating conditions. The results show that the proposed strategy achieves high global optimality and excellent SOC control ability under both training and unknown conditions.
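To illustrate the two mechanisms named above, the sketch below shows a generic Double Q-learning update combined with an SOC-dependent action mask (a "variable action space"). It is not the paper's implementation: the state discretization, the fuel cell power levels in `ACTIONS`, the SOC thresholds in `valid_actions`, and the reward (assumed to be the negative TCO increment per step) are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of Double Q-learning with an SOC-dependent action mask.
# State grid, action set, thresholds and hyperparameters are assumptions,
# not values taken from the paper.

rng = np.random.default_rng(0)

N_STATES = 100                                # assumed discretized (SOC, demand) grid
ACTIONS = np.linspace(0.0, 20.0, 11)          # assumed fuel cell power levels [kW]
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1            # assumed learning hyperparameters

Q_A = np.zeros((N_STATES, len(ACTIONS)))
Q_B = np.zeros((N_STATES, len(ACTIONS)))

def valid_actions(soc):
    """Variable action space: shrink the action set near the SOC bounds (assumed rule)."""
    if soc < 0.4:                             # SOC low -> allow only higher fuel cell power
        return np.arange(len(ACTIONS))[ACTIONS >= 10.0]
    if soc > 0.8:                             # SOC high -> allow only lower fuel cell power
        return np.arange(len(ACTIONS))[ACTIONS <= 10.0]
    return np.arange(len(ACTIONS))

def choose_action(state, soc):
    """Epsilon-greedy selection restricted to the currently valid actions."""
    acts = valid_actions(soc)
    if rng.random() < EPS:
        return rng.choice(acts)
    q = (Q_A[state, acts] + Q_B[state, acts]) / 2.0
    return acts[np.argmax(q)]

def update(state, action, reward, next_state, next_soc):
    """Double Q-learning: one table selects the greedy next action, the other
    evaluates it, which mitigates the overestimation bias of standard Q-learning."""
    acts = valid_actions(next_soc)
    if rng.random() < 0.5:
        a_star = acts[np.argmax(Q_A[next_state, acts])]
        target = reward + GAMMA * Q_B[next_state, a_star]
        Q_A[state, action] += ALPHA * (target - Q_A[state, action])
    else:
        a_star = acts[np.argmax(Q_B[next_state, acts])]
        target = reward + GAMMA * Q_A[next_state, a_star]
        Q_B[state, action] += ALPHA * (target - Q_B[state, action])
```

In this sketch the action mask enforces the SOC limits directly through the admissible action set rather than through a reward penalty, which is one common way to realize the "variable action space" idea described in the abstract.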