Abstract
Parallel machine scheduling with sequence-dependent family setups has attracted much attention from academia and industry due to its practical applications. In a real-world manufacturing system, however, solving the scheduling problem is challenging because the schedule must cope with urgent and frequent changes in the demand and due dates of products. To minimize the total tardiness, we propose a deep reinforcement learning (RL) based scheduling framework in which trained neural networks (NNs) can solve unseen scheduling problems without re-training, even when such changes occur. Specifically, we propose state and action representations whose dimensions are independent of the production requirements and due dates of jobs while accommodating family setups. In addition, an NN architecture with parameter sharing is used to improve training efficiency. Extensive experiments demonstrate that the proposed method outperforms recent metaheuristics, rule-based methods, and other RL-based methods in terms of total tardiness. Moreover, the computation time needed to obtain a schedule with our framework is shorter than that of the metaheuristics and the other RL-based methods.
Highlights
As the competition among enterprises intensifies, production scheduling becomes one of the essential decision-making problems in modern manufacturing systems.
We focus on the unrelated parallel machine scheduling problem (UPMSP) with sequence-dependent family setup time (SDFST), which has attracted a great deal of attention in domains such as the semiconductor [3]–[5], chemical [6], and food [7] industries.
In this paper, we propose a deep reinforcement learning (DRL)-based method for solving UPMSPs with the SDFST constraint to minimize the total tardiness.
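To make the objective concrete, the following is a minimal sketch of how the total tardiness of a given schedule could be evaluated for unrelated machines with family setups. It is an illustration, not the paper's formulation: the function name `total_tardiness`, the dictionary layout, the toy data, and the assumption that a machine needs no setup before its first job are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def total_tardiness(
    schedule: Dict[int, List[int]],        # machine -> ordered list of job ids
    family: Dict[int, str],                # job -> family label
    due: Dict[int, float],                 # job -> due date
    proc: Dict[Tuple[int, int], float],    # (job, machine) -> processing time
    setup: Dict[Tuple[str, str], float],   # (previous family, next family) -> setup time
) -> float:
    """Sum of max(0, completion time - due date) over all scheduled jobs."""
    total = 0.0
    for machine, jobs in schedule.items():
        clock = 0.0
        prev_family: Optional[str] = None  # assumption: no setup before the first job
        for job in jobs:
            fam = family[job]
            # A setup is incurred only when the family changes on this machine.
            if prev_family is not None and fam != prev_family:
                clock += setup[(prev_family, fam)]
            # Unrelated machines: processing time depends on both job and machine.
            clock += proc[(job, machine)]
            total += max(0.0, clock - due[job])
            prev_family = fam
    return total


# Hypothetical toy instance: three jobs, two families, two machines.
family = {0: "A", 1: "A", 2: "B"}
due = {0: 3.0, 1: 5.0, 2: 4.0}
proc = {(0, 0): 2.0, (1, 0): 2.0, (2, 0): 3.0,
        (0, 1): 3.0, (1, 1): 3.0, (2, 1): 2.0}
setup = {("A", "B"): 1.0, ("B", "A"): 2.0}

split_schedule = {0: [0, 1], 1: [2]}     # family A on machine 0, family B on machine 1
single_schedule = {0: [0, 2, 1], 1: []}  # everything on machine 0, two family changes

print(total_tardiness(split_schedule, family, due, proc, setup))   # 0.0
print(total_tardiness(single_schedule, family, due, proc, setup))  # 7.0
```

In the toy instance, grouping jobs of the same family on one machine avoids both setups and all tardiness, while interleaving families on a single machine delays the later jobs past their due dates.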
Summary
As the competition among enterprises intensifies, production scheduling becomes one of the essential decision-making problems in modern manufacturing systems. Since learning complexity grows quickly as the numbers of jobs and machines increase, it is intractable to re-train a deep neural network (DNN) whenever demand or due dates change in large-scale manufacturing systems. To this end, we propose a DRL-based method for minimizing the total tardiness of UPMSPs with SDFST. Among RL-based methods for solving UPMSPs, Zhang et al. [37], [38] adopted Q-learning (QL) to minimize the weighted tardiness. They employed a linear basis function to approximate Q-values for given state features indicating the status. Yuan et al. [39], [40] addressed ready time constraints and machine breakdown for minimizing the total tardiness and the number of tardy jobs, respectively. They adopted a tabular method that stores Q-values by exploring state-action pairs.
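The two styles of Q-value storage mentioned above can be contrasted with a small generic sketch. It shows a standard one-step Q-learning update in tabular form and with a linear approximator; it is not the implementation of the cited works, and the function names, the state and action labels, and the reward of -1.0 (standing in for a tardiness penalty) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, List, Sequence, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def tabular_update(q: QTable, state: Hashable, action: Hashable, reward: float,
                   next_state: Hashable, next_actions: Sequence[Hashable],
                   alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Tabular Q-learning: one stored value per visited (state, action) pair."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error


def linear_q(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear approximation: Q is a weighted sum of state-action features."""
    return sum(w * x for w, x in zip(weights, features))


def linear_update(weights: Sequence[float], features: Sequence[float], reward: float,
                  next_features: Sequence[Sequence[float]],
                  alpha: float = 0.1, gamma: float = 0.9) -> List[float]:
    """Semi-gradient Q-learning step; next_features has one vector per next action."""
    best_next = max((linear_q(weights, f) for f in next_features), default=0.0)
    td_error = reward + gamma * best_next - linear_q(weights, features)
    # The gradient of a linear Q with respect to its weights is the feature vector.
    return [w + alpha * td_error * x for w, x in zip(weights, features)]


# Hypothetical single transition with reward -1.0.
q: QTable = defaultdict(float)
tabular_update(q, "s0", "assign_job_3", -1.0, "s1", ["assign_job_1", "assign_job_2"],
               alpha=0.5)
new_weights = linear_update([0.0, 0.0], [1.0, 2.0], -1.0, [[1.0, 0.0]], alpha=0.5)
print(q[("s0", "assign_job_3")])  # -0.5
print(new_weights)                # [-0.5, -1.0]
```

The table grows with the number of distinct state-action pairs, whereas the linear form keeps a fixed number of weights. This illustrates why the size of the state and action representation matters as the numbers of jobs and machines increase.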