Shared communication for coordinated large-scale reinforcement learning control

Nicolas Bougie, Yoshimasa Tsuruoka, Takashi Onishi

doi:10.1080/18824889.2023.2174647

Abstract

Deep Reinforcement Learning (DRL) has recently emerged as a way to control complex systems without the need to model them mathematically. In contrast to classical controllers, DRL alleviates the need for constant parameter tuning, tedious design of control laws, and re-identification procedures when performance degrades. However, industrial adoption of DRL algorithms remains modest, and they have not yet established a significant position in the process industries. One major obstacle is their sample inefficiency on tasks with large state-action spaces. In this work, we show that DRL can be used for plant-wide control by decentralizing and coordinating reinforcement learning. Specifically, we express the global policy as a collection of local policies. Each local policy receives local observations and is responsible for controlling a different region of the environment. To enable coordination among local policies, we present a mechanism based on message passing. Messages are encoded by a shared communication channel, which is equipped with a model-based stream that captures the dynamics of the system and enables effective pre-training. The proposed method is evaluated on a set of robotic tasks and on a large-scale vinyl acetate monomer (VAM) plant. Experimental results show that the proposed model substantially outperforms the baselines in terms of mean scores and sample efficiency.
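
To make the decomposition described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It shows a global policy assembled from local policies that coordinate through messages produced by a single shared encoder. The class and function names, the single tanh layers, the random weights, and the mean pooling of incoming messages are all assumptions made for illustration; the model-based stream, pre-training, and the reinforcement learning update itself are omitted.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Return a random weight matrix (n_out x n_in) as a list of lists."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def linear_tanh(weights, x):
    """Apply a single tanh layer: tanh(W x)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


class SharedChannel:
    """Hypothetical shared encoder: the same weights encode every region's message."""

    def __init__(self, obs_dim, msg_dim, rng):
        self.weights = make_linear(obs_dim, msg_dim, rng)

    def encode(self, local_obs):
        return linear_tanh(self.weights, local_obs)


class LocalPolicy:
    """Hypothetical local policy: (local observation, incoming message) -> local action."""

    def __init__(self, obs_dim, msg_dim, act_dim, rng):
        self.weights = make_linear(obs_dim + msg_dim, act_dim, rng)

    def act(self, local_obs, incoming_msg):
        # Concatenate the local observation with the pooled message.
        return linear_tanh(self.weights, local_obs + incoming_msg)


def global_step(global_obs, policies, channel, obs_dim):
    """Split the global observation by region, exchange messages, act locally.

    Assumes at least two regions, so every policy has a message to receive.
    """
    n = len(policies)
    local_obs = [global_obs[i * obs_dim:(i + 1) * obs_dim] for i in range(n)]
    messages = [channel.encode(o) for o in local_obs]
    actions = []
    for i, policy in enumerate(policies):
        others = [m for j, m in enumerate(messages) if j != i]
        # Mean-pool the other regions' messages (an assumed aggregation rule).
        incoming = [sum(vals) / len(others) for vals in zip(*others)]
        actions.extend(policy.act(local_obs[i], incoming))
    # The global action is the concatenation of all local actions.
    return actions


rng = random.Random(0)
N_REGIONS, OBS_DIM, MSG_DIM, ACT_DIM = 3, 4, 2, 1
channel = SharedChannel(OBS_DIM, MSG_DIM, rng)
policies = [LocalPolicy(OBS_DIM, MSG_DIM, ACT_DIM, rng) for _ in range(N_REGIONS)]
global_obs = [rng.uniform(-1.0, 1.0) for _ in range(N_REGIONS * OBS_DIM)]
global_action = global_step(global_obs, policies, channel, OBS_DIM)
print(len(global_action))  # one action per region: 3
```

In this sketch each local policy sees only its own slice of the observation plus a pooled message, so the per-policy input size stays fixed as regions are added, which is one plausible reading of how the decomposition keeps large state-action spaces tractable.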

Full Text