In the semiconductor manufacturing industry, the Dynamic Flexible Job Shop Scheduling Problem is regarded as one of the most complex and significant scheduling problems. Existing studies consider the dynamic arrival of jobs; however, the insertion of urgent jobs, such as test chips, challenges the production model, and new scheduling methods are urgently needed to improve the shop floor's dynamic responsiveness and self-adjustment. In this work, deep reinforcement learning is applied to the dynamic flexible job shop scheduling problem to enable near-real-time shop floor decision-making. Eight state features, including machine utilization and operation completion rate, are extracted to reflect real-time shop floor production data. By analyzing machine availability, each machine's earliest available time is redefined and incorporated into the design of compound scheduling rules; eight such rules are developed for job selection and machine assignment. The state features serve as inputs to a Double Deep Q-Network, which outputs the state-action value (Q-value) of each compound scheduling rule, allowing the agent to learn an effective scheduling policy through training. Simulation studies show that the proposed Double Deep Q-Network algorithm outperforms other heuristics and well-known scheduling rules, generating high-quality solutions quickly. In most scenarios, it also outperforms the Deep Q-Network, Q-Learning, and State-Action-Reward-State-Action (SARSA) frameworks. Moreover, the agent generalizes well to similar optimization objectives.
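As a rough illustration of the framework described above (not code from the paper), the sketch below shows how a Double Deep Q-Network could map the eight state features to one Q-value per compound scheduling rule, select a rule epsilon-greedily, and apply the double-Q update in which the online network chooses the next action and the target network evaluates it. The network width, learning rate, discount factor, and function names are illustrative assumptions.

```python
import random
import torch
import torch.nn as nn

N_FEATURES = 8  # state features (machine utilization, operation completion rate, ...)
N_RULES = 8     # compound scheduling rules (job selection + machine assignment)

class QNet(nn.Module):
    """Maps the shop-floor state features to a Q-value per compound rule."""
    def __init__(self, hidden=64):  # hidden width is an assumption, not from the paper
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_FEATURES, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, N_RULES),
        )

    def forward(self, s):
        return self.net(s)

online, target = QNet(), QNet()
target.load_state_dict(online.state_dict())  # periodically re-synced in practice
opt = torch.optim.Adam(online.parameters(), lr=1e-3)

def select_rule(state, eps=0.1):
    """Epsilon-greedy choice among the compound scheduling rules."""
    if random.random() < eps:
        return random.randrange(N_RULES)
    with torch.no_grad():
        return online(state.unsqueeze(0)).argmax(dim=1).item()

def ddqn_update(s, a, r, s_next, done, gamma=0.95):
    """Double DQN: online net selects the next action, target net evaluates it."""
    q = online(s).gather(1, a)                               # Q(s, a) for taken actions
    with torch.no_grad():
        a_next = online(s_next).argmax(dim=1, keepdim=True)  # action selection
        q_next = target(s_next).gather(1, a_next)            # action evaluation
        y = r + gamma * (1.0 - done) * q_next                # bootstrapped target
    loss = nn.functional.mse_loss(q, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Under this reading, at each rescheduling point (e.g., a job arrival or an urgent-job insertion) the agent would observe the eight features, call select_rule to pick a compound rule that dispatches the next operation to a machine, and store the transition for later ddqn_update calls.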