With the increasing adoption of artificial intelligence for complex engineering optimization problems, various learning mechanisms, including deep learning (DL) and reinforcement learning (RL), have been developed for manufacturing scheduling. In this paper, a collaborative-learning multi-agent RL method (CL-MARL) is proposed to solve the distributed hybrid flow-shop scheduling problem (DHFSP) with the objectives of minimizing both makespan and total energy consumption. First, the DHFSP is formulated as a Markov decision process: the features of machines and jobs are represented as state and observation matrices according to their characteristics, the candidate operation set is used as the action space, and a reward mechanism is designed based on machine utilization. Next, a set of critic networks and actor networks, consisting of recurrent neural networks and fully connected networks, is employed to map the states and observations into output values. Then, a novel distance matching strategy is designed for each agent to select the most appropriate action at each scheduling step. Finally, the proposed CL-MARL model is trained with the multi-agent deep deterministic policy gradient algorithm in a collaborative-learning manner. Numerical results demonstrate the effectiveness of the proposed multi-agent system, and comparisons with existing algorithms indicate the strong potential of CL-MARL for solving the DHFSP.
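To make the idea of distance matching concrete, the following is a minimal sketch and not the authors' implementation. It assumes that an actor network emits a continuous vector, that each candidate operation is described by a feature vector of the same length, and that the agent picks the candidate with the smallest Euclidean distance to the actor output. The abstract does not specify the metric or the features, so the function name `match_action`, the operation identifiers, and the feature values are all hypothetical.

```python
import math


def match_action(actor_output, candidates):
    """Return the id of the candidate closest to the actor output.

    actor_output: list of floats (assumed continuous output of an actor network).
    candidates:   dict mapping a hypothetical operation id to its feature vector.
    The Euclidean metric is an assumption made for illustration only.
    """
    if not candidates:
        raise ValueError("candidate operation set is empty")
    return min(
        candidates,
        key=lambda op_id: math.dist(actor_output, candidates[op_id]),
    )


# Hypothetical candidate operation set at one scheduling step.
candidates = {
    "job1_stage2": [0.9, 0.1],
    "job3_stage1": [0.2, 0.8],
    "job4_stage1": [0.5, 0.5],
}

# Hypothetical actor output for one agent.
chosen = match_action([0.3, 0.7], candidates)
print(chosen)
```

A mapping of this kind is one common way to use a deterministic policy gradient method, whose actors produce continuous outputs, with a discrete action space such as a candidate operation set.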