Abstract

This paper presents an effective paradigm that makes full use of both Deep Reinforcement Learning (DRL) and expert knowledge to find an optimal control strategy. The paradigm consists of three parts: DRL, expert demonstrations, and behavior cloning. To the best of our knowledge, this is the first time such a paradigm has been applied to suppressing tank sloshing, here with two actively controlled horizontal baffles. A self-developed computational fluid dynamics (CFD) solver provides the tank-sloshing environment. For direct DRL, both a proximal policy optimization (PPO) agent and a twin delayed deep deterministic policy gradient (TD3) agent are tested; the strategies obtained by different algorithms may differ even in the same environment. We then derive a simplified parametric control policy informed by direct DRL. Finally, DRL with behavior cloning is used to optimize this simplified parametric control policy. After training, the agent can actively control the baffles and reduce sloshing by ∼81.48%. Fourier analysis of the surface elevations shows that the control strategy obtained by DRL with behavior cloning disperses the wave energy and shifts the sloshing frequency of the tank through fast oscillation of the baffles. This suggests a route to suppressing sloshing, akin to forcing the waves to break up ahead of time. The experience and insights gained from this study indicate that coupling DRL, expert demonstrations, and behavior cloning is the future development direction of DRL + CFD.
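To make the three-part paradigm concrete, the following is a minimal sketch, not the authors' implementation: a policy is first pretrained by behavior cloning on expert demonstrations (here a hypothetical simplified parametric controller standing in for the one distilled from direct DRL), then fine-tuned by a policy-gradient update that keeps a behavior-cloning term in the loss. The observation/action sizes, the expert_action function, the reward, and the lambda_bc weight are all illustrative placeholders; the real reward would come from the CFD solver.

import torch
import torch.nn as nn

OBS_DIM, ACT_DIM = 8, 2   # e.g. wave-probe readings -> two horizontal baffles

policy = nn.Sequential(
    nn.Linear(OBS_DIM, 64), nn.Tanh(),
    nn.Linear(64, ACT_DIM), nn.Tanh(),   # baffle commands scaled to [-1, 1]
)
log_std = torch.zeros(ACT_DIM, requires_grad=True)
opt = torch.optim.Adam(list(policy.parameters()) + [log_std], lr=3e-4)

def expert_action(obs):
    # Hypothetical simplified parametric policy informed by direct DRL:
    # a fast, near-bang-bang response to the leading observation channels.
    return torch.tanh(10.0 * obs[:, :ACT_DIM])

def sample_states(n=256):
    # Placeholder for states sampled from the CFD sloshing environment.
    return torch.randn(n, OBS_DIM)

# Stage 1: behavior cloning -- supervised regression to the expert.
for _ in range(500):
    obs = sample_states()
    loss_bc = nn.functional.mse_loss(policy(obs), expert_action(obs))
    opt.zero_grad(); loss_bc.backward(); opt.step()

# Stage 2: policy-gradient fine-tuning with an auxiliary BC term,
# i.e. loss = L_RL + lambda_bc * L_BC (the coupling the abstract advocates).
lambda_bc = 0.1
for _ in range(500):
    obs = sample_states()
    dist = torch.distributions.Normal(policy(obs), log_std.exp())
    act = dist.sample()
    # Dummy reward signal; in the paper it would reflect the reduction
    # in free-surface elevation computed by the CFD solver.
    reward = -act.pow(2).sum(dim=1)
    adv = reward - reward.mean()                      # baseline-subtracted advantage
    loss_rl = -(dist.log_prob(act).sum(dim=1) * adv).mean()
    loss_bc = nn.functional.mse_loss(policy(obs), expert_action(obs))
    loss = loss_rl + lambda_bc * loss_bc
    opt.zero_grad(); loss.backward(); opt.step()

The BC term anchors the agent near the expert demonstrations early in training, while the policy-gradient term lets it improve beyond them; the paper's PPO/TD3 agents play the role of the simple REINFORCE-style update used here.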
