Abstract

Motivated by the production of electronic control modules in the digital electronic detonator industry, we study a multi-objective flexible flow shop scheduling problem whose objective is to find a feasible schedule that minimizes both the makespan and the total tardiness. We formulate a mixed integer programming model that captures the constraints imposed by the jobs and the machines throughout the manufacturing process. We then cast the scheduling problem as a Markov decision process: the agent's state features and actions are designed from the processing status of the machines and the jobs together with heuristic rules, and the reward function is derived from the optimization objectives. To solve the problem, we design a Dueling Double Deep Q-Network (D3QN) algorithm that incorporates a target network, a dueling network, and an experience replay buffer. We compare the D3QN algorithm with heuristic rules, a genetic algorithm (GA), and the optimal solutions generated by Gurobi, and we conduct ablation experiments. The experimental results demonstrate the high performance of the proposed D3QN algorithm with the target network and the dueling network. The proposed scheduling model and algorithm can provide theoretical support for planning the production of electronic control modules and for improving production efficiency.
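To make the D3QN ingredients named above concrete, the following is a minimal sketch of the two standard building blocks: the dueling aggregation of a state value and per-action advantages, and the double Q-learning target in which the online network selects the next action and the target network evaluates it. The function names, the discount factor, and all numeric values are illustrative assumptions; the abstract does not specify the network architecture, the reward definition, or the hyperparameters used in the paper.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + adv - mean_adv for adv in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    done: bool,
    q_online_next: List[float],
    q_target_next: List[float],
) -> float:
    """Double DQN target: the online net picks the action, the target net scores it."""
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical numbers: three candidate actions (e.g. three dispatching rules).
q_online = dueling_q(value=1.0, advantages=[0.5, -0.5, 0.0])
q_target = [2.0, 3.0, 1.0]  # assumed outputs of the target network
target = double_dqn_target(
    reward=-1.0, gamma=0.5, done=False,
    q_online_next=q_online, q_target_next=q_target,
)
```

In this sketch the online estimate prefers the first action, so the target uses the target network's value for that action rather than the target network's own maximum, which is the mechanism double Q-learning uses to reduce overestimation.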
