The existing scheduling methods of wafer fabs focus on single area, achieving local optimisation while failing to realise global optimisation due to neglecting the coordination of multi-area. Therefore, it is necessary to consider the complex opposing relationships between multi-area caused by constraints such as batch processing, re-entrance, and multiple residency times within and between areas to conduct integrated scheduling and shorten the production cycle time. For this issue, this paper proposes a cooperative multi-agent reinforcement learning for multi-area integrated scheduling. Aiming at the dynamic batching and scheduling considering the dynamic arrival lots in multi-area, a multi-agent reinforcement learning algorithm is presented to learn the optimal dynamic batching and scheduling policy firstly. Subsequently, a cooperative multi-agent framework is raised to achieve the global optimisation and coordination of multi-area. Furthermore, an adaptive exploration strategy is constructed to enhance the global exploration capability of the complex solution space caused by residency time constraints and re-entrant property. Moreover, a policy share enhanced Double DQN is employed to improve the generalisation and adaptability of the multi-agent. Finally, the experiments demonstrate that the proposed integrated scheduling method has better comprehensive performance compared to the previous area-separated scheduling methods.
Read full abstract