The flexible job shop scheduling problem (FJSSP), which can significantly enhance production efficiency, is a mathematical optimization problem widely applied in modern manufacturing industries. However, due to its NP-hard nature, finding an optimal solution for all scenarios within a reasonable time frame faces serious challenges. This paper proposes a solution that transforms the FJSSP into a Markov Decision Process (MDP) and employs deep reinforcement learning (DRL) techniques for resolution. First, we represent the state features of the scheduling environment using seven feature vectors and utilize a transformer encoder as a feature extraction module to effectively capture the relationships between state features and enhance representation capability. Second, based on the features of the jobs and machines, we design 16 composite dispatching rules from multiple dimensions, including the job completion rate, processing time, waiting time, and manufacturing resource utilization, to achieve flexible and efficient scheduling decisions. Furthermore, we project an intuitive and dense reward function with the objective of minimizing the total idle time of machines. Finally, to verify the performance and feasibility of the algorithm, we evaluate the proposed policy model on the Brandimarte, Hurink, and Dauzere datasets. Our experimental results demonstrate that the proposed framework consistently outperforms traditional dispatching rules, surpasses metaheuristic methods on larger-scale instances, and exceeds the performance of existing DRL-based scheduling methods across most datasets.
Read full abstract