Abstract

In this article, an adaptive proportional-integral (PI) controller for regulating the output voltage of a proton exchange membrane fuel cell (PEMFC) is proposed. The PI controller operates on the basis of ambient intelligence large-scale deep reinforcement learning: its coefficients are tuned by an ambient intelligence exploration multi-delay deep deterministic policy gradient (AIEM-DDPG) algorithm, an improvement on the original deep deterministic policy gradient (DDPG) algorithm that incorporates ambient intelligence exploration. With DDPG as its core, the AIEM-DDPG algorithm runs a variety of deep reinforcement learning algorithms, including soft actor-critic (SAC), DDPG, proximal policy optimization (PPO) and double deep Q-network (DDQN), to attain distributed exploration of the environment. In addition, a classified prioritized experience replay mechanism is introduced to improve exploration efficiency. Clipped multi-Q learning, delayed policy updating, target policy smoothing regularization and other methods are utilized to mitigate Q-value overestimation. The result is a model-free algorithm with good global search ability and optimization speed. Simulation results show that the AIEM-DDPG adaptive PI controller attains better robustness and adaptability, as well as a good control effect.
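The overestimation countermeasures named in the abstract (clipped multi-Q learning, delayed policy updates, target policy smoothing) follow the pattern popularized by the TD3 algorithm. As a minimal sketch with purely illustrative numbers and function names (not from the paper), the bootstrap-target computation might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def smoothed_target_action(mu, noise_std=0.2, noise_clip=0.5, act_limit=1.0):
    # Target policy smoothing: perturb the target actor's action with
    # clipped Gaussian noise so the critic target is less sensitive to
    # sharp peaks in the learned Q-function.
    eps = float(np.clip(rng.normal(0.0, noise_std), -noise_clip, noise_clip))
    return float(np.clip(mu + eps, -act_limit, act_limit))

def clipped_q_target(reward, next_qs, gamma=0.99, done=False):
    # Clipped multi-Q learning: take the minimum over several target
    # critics' estimates to counteract Q-value overestimation.
    q_min = min(next_qs)
    return reward + (0.0 if done else gamma * q_min)

# Two target critics disagree; the pessimistic (minimum) estimate
# forms the bootstrap target: 1.0 + 0.99 * 4.0 = 4.96.
y = clipped_q_target(reward=1.0, next_qs=[5.0, 4.0])
```

Delayed policy updating then simply means the actor (and target networks) are updated once for every several of these critic updates.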

Highlights

  • In recognition of the serious environmental pollution caused by conventional fuels, many countries have invested in R&D on new energy fuels [1], [2]

  • Although the deep deterministic policy gradient (DDPG) algorithm is applied in various control fields [41]–[43], it suffers from problems common to many deep reinforcement learning algorithms: it requires long off-line training before practical application, and it cannot generalize to every environment when training is insufficient, which leads to poor robustness when the algorithm is employed for decision-making

  • The AIEM-DDPG algorithm is used as the tuner of the adaptive proportional-integral (PI) controller, and the coefficients of the PI controller are regulated in real time by the tuner


Summary

INTRODUCTION

In recognition of the serious environmental pollution caused by conventional fuels, many countries have invested in R&D on new energy fuels [1], [2]. Although DDPG is applied in various control fields [41]–[43], it suffers from problems common to many deep reinforcement learning algorithms: it requires long off-line training before practical application, and it cannot generalize to every environment when training is insufficient, which leads to poor robustness when the algorithm is employed for decision-making. The proposed controller capitalizes on the excellent sensing and decision-making ability of the AIEM-DDPG algorithm: it actively regulates the coefficients of the PI controller according to the system state, so that the anode hydrogen flow is regulated in real time to control the output voltage. The role of the DC/DC converter is not considered in the model, so the output voltage is equal to the stack voltage.
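The control loop described above can be sketched as a discrete-time PI controller whose gains are overwritten at every step by an external tuner, the role the AIEM-DDPG agent plays. All gains, plant dynamics and names below are illustrative assumptions, not the paper's model:

```python
class AdaptivePI:
    # Discrete-time PI controller; set_gains() is the hook an external
    # tuner (the RL agent in the paper) would call at every step.
    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0

    def set_gains(self, kp, ki):
        self.kp, self.ki = kp, ki

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral

# Toy first-order plant standing in for the stack-voltage response;
# the real PEMFC dynamics in the paper are far more involved.
pi = AdaptivePI(kp=2.0, ki=1.0, dt=0.05)
v = 0.0                                 # plant output (normalized voltage)
for _ in range(400):
    pi.set_gains(2.0, 1.0)              # a trained agent would choose these
    u = pi.step(setpoint=1.0, measurement=v)
    v += 0.05 * (u - v)                 # forward-Euler plant update
```

With fixed gains this is an ordinary PI loop driving the output to the setpoint; the adaptive scheme differs only in that `set_gains()` receives the agent's action instead of constants.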

HYDROGEN FLOW RATE OF PEMFC
PEMFC OUTPUT VOLTAGE CONTROL PRINCIPLE
COMMON POLICY GRADIENT ALGORITHMS
AIEM-DDPG
STATE SPACE
SIMULATION
Findings
CONCLUSION