Reinforcement learning (RL) has emerged as a promising approach for scheduling semiconductor operations. However, solving large-scale scheduling problems with RL remains challenging because learning complexity grows rapidly as the size of the shop floor increases. The challenge becomes more pronounced when the problem involves a wide variety of job types, which makes both exploration and function approximation difficult. This article presents a scheduling method for semiconductor packaging facilities using deep RL, in which an agent allocates a job to one of the machines in a centralized manner. Specifically, a novel state representation is introduced to effectively accommodate variations in the number of available machines and in the production requirements. Furthermore, we propose a continuous action representation that keeps the size of the action space fixed even when the numbers of jobs, machines, and operation types change. Extensive experiments on large-scale datasets demonstrate that, in terms of makespan, the proposed method mostly outperforms the metaheuristics, the rule-based methods, and the other RL approaches considered, while requiring much less computation time than the metaheuristics.