Assembly flowshops differ from conventional flowshops in that each job has several components that are manufactured in the processing stage and then joined together in the assembly stage. Real-time scheduling of two-stage assembly flowshops is of great significance for many discrete manufacturing companies in customized order fulfilment. Therefore, we study the two-stage assembly flowshop scheduling problem with the objective of minimizing total tardiness under dynamic job arrivals using deep reinforcement learning (DRL). First, an architecture for DRL-based real-time scheduling is proposed. Second, we establish a DRL-based real-time scheduling model by designing a proper state space, action space, and reward function. A scheduling agent is constructed and trained with proximal policy optimization (PPO) using interactive simulation data. The experimental results show that the trained scheduling agent learns to choose appropriate dispatching rules efficiently, so as to make real-time scheduling decisions quickly and effectively. The PPO algorithm outperforms six dispatching rules, a genetic algorithm, and two DRL algorithms (i.e., DDQN and DDPG) in terms of solution quality and computing speed. The proposed approach is highly applicable to many practical scheduling situations that require real-time, quick decisions.
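To make the approach concrete, the following is a minimal sketch (not the authors' implementation) of a PPO-style agent whose actions are dispatching rules: an actor-critic network maps shop-state features to a distribution over rules and is updated with the clipped surrogate objective. The rule set, state dimension, and network sizes below are illustrative assumptions, and the entropy bonus and rollout collection are omitted.

```python
# Hedged sketch of a PPO agent that selects dispatching rules for
# real-time scheduling; all names and dimensions are assumptions.
import torch
import torch.nn as nn

RULES = ["SPT", "LPT", "EDD", "FIFO", "SRPT", "CR"]  # hypothetical rule set

class PolicyNet(nn.Module):
    def __init__(self, state_dim: int, n_rules: int):
        super().__init__()
        self.actor = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(),
            nn.Linear(64, n_rules),          # logits over dispatching rules
        )
        self.critic = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(),
            nn.Linear(64, 1),                # state value for the advantage
        )

    def forward(self, state):
        return self.actor(state), self.critic(state)

def ppo_loss(net, states, actions, old_log_probs, advantages, returns, eps=0.2):
    """Clipped PPO surrogate plus a value loss (entropy bonus omitted)."""
    logits, values = net(states)
    dist = torch.distributions.Categorical(logits=logits)
    ratio = torch.exp(dist.log_prob(actions) - old_log_probs)
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps)
    policy_loss = -torch.min(ratio * advantages, clipped * advantages).mean()
    value_loss = (values.squeeze(-1) - returns).pow(2).mean()
    return policy_loss + 0.5 * value_loss

# At each rescheduling point (e.g., a dynamic job arrival), the trained agent
# maps the current shop state to a dispatching rule in one forward pass:
net = PolicyNet(state_dim=8, n_rules=len(RULES))
state = torch.randn(8)                       # placeholder shop-state features
with torch.no_grad():
    logits, _ = net(state)
rule = RULES[torch.argmax(logits).item()]    # greedy selection at inference
```

Because inference is a single forward pass, rule selection takes milliseconds, which is consistent with the paper's emphasis on computing speed relative to the genetic algorithm baseline.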