This study presents a Two-Layer Deep Deterministic Policy Gradient (TL-DDPG) energy management strategy for hydrogen fuel cell hybrid trains. It addresses two weaknesses of traditional reinforcement learning strategies: they place high demands on initial values, and they have difficulty optimizing global variables. To strengthen the optimization capability of the inner layer, a frequency decoupling algorithm is integrated into the outer layer, where it supplies a suitable initial value for strategy optimization. This addition is intended to stabilize the fuel cell output and thereby improve the overall efficiency of the hybrid power system. Compared with the traditional reinforcement learning algorithm, the proposed approach reduces hydrogen consumption by 16.3 kg per 100 km, increases the output power stability of the fuel cell by 9.7%, and improves its efficiency by 1.8%.
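As a rough illustration of what a frequency decoupling step can look like, the sketch below splits a traction power demand profile into a slowly varying part (a candidate fuel cell reference) and a fast residual (left to the auxiliary energy storage). The abstract does not specify the filter, its parameters, or how the result is passed to the DDPG layer, so the first-order low-pass filter, the function name `frequency_decouple`, the time constant, and the sample demand values are all assumptions made for illustration only.

```python
def frequency_decouple(power_demand_kw, dt_s=1.0, time_constant_s=30.0):
    """Split a power demand profile into low- and high-frequency parts.

    Assumption: a discrete first-order low-pass filter stands in for the
    frequency decoupling algorithm; the source does not name the filter.
    dt_s and time_constant_s are hypothetical values.
    """
    # Smoothing factor of the discrete first-order low-pass filter.
    alpha = dt_s / (time_constant_s + dt_s)

    low_freq_kw = []   # slowly varying part, e.g. a fuel cell reference
    high_freq_kw = []  # fast residual, e.g. covered by energy storage

    # Start the filter state at the first sample to avoid a start-up jump.
    state = power_demand_kw[0] if power_demand_kw else 0.0
    for demand in power_demand_kw:
        state = state + alpha * (demand - state)
        low_freq_kw.append(state)
        high_freq_kw.append(demand - state)
    return low_freq_kw, high_freq_kw


# Hypothetical demand profile in kW, sampled once per second.
demand = [100.0, 100.0, 400.0, 400.0, 150.0]
low, high = frequency_decouple(demand)
```

By construction the two components sum back to the original demand at every step, so the split only redistributes power between sources. In a scheme like the one described, the low-frequency component could plausibly serve as the initial fuel cell power reference that the learning layer then refines, although the abstract does not state the exact mechanism.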