Deep reinforcement learning based preventive maintenance policy for serial production lines

Jing Huang, Qing Chang, Jorge Arinez

doi:10.1016/j.eswa.2020.113701


Open Access

https://doi.org/10.1016/j.eswa.2020.113701


Abstract

In the manufacturing industry, preventive maintenance (PM) is a common practice for reducing random machine failures by replacing or repairing aged machines or parts. Deciding when and where to carry out preventive maintenance is nontrivial because of the complex and stochastic nature of a serial production line with intermediate buffers. To improve the cost efficiency of serial production lines, a deep reinforcement learning based approach is proposed to obtain a PM policy. A novel modeling method for the serial production line is adopted during the learning process. A reward function is proposed based on the system production loss evaluation. An algorithm based on the Double Deep Q-Network is applied to learn the PM policy. In a simulation study, the learning algorithm is shown to be effective in delivering a PM policy that leads to increased throughput and reduced cost. Interestingly, the learned policy is found to frequently conduct “group maintenance” and “opportunistic maintenance”, although these concepts and their rules are not provided during the learning process. This finding further demonstrates that the problem formulation, the proposed algorithm and the reward function setting in this paper are effective.
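For readers unfamiliar with the Double Deep Q-Network mentioned in the abstract, the following is a minimal sketch of its standard learning target, not the authors' implementation. The function name, the discount factor, and the toy Q-values are hypothetical, and the abstract does not specify how the paper encodes states, actions, or rewards. The sketch shows only the defining step: the online network selects the next action, and the target network evaluates it.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.9,
    done: bool = False,
) -> float:
    """Standard Double DQN target for a single transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    All argument names and the default gamma are illustrative assumptions.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


# Toy numbers (hypothetical): three candidate maintenance actions.
reward = -2.0                      # e.g. a negative reward for a cost or loss
next_q_online = [1.0, 3.0, 2.0]    # online network prefers action 1
next_q_target = [0.5, 1.5, 4.0]    # target network would prefer action 2

y_double = double_dqn_target(reward, next_q_online, next_q_target)
y_vanilla = reward + 0.9 * max(next_q_target)  # plain DQN target, for contrast
```

In this toy case the Double DQN target bootstraps from the target network's value for action 1, whereas the plain DQN target bootstraps from the target network's own maximum. Decoupling selection from evaluation in this way is what reduces the overestimation bias of plain Q-learning targets.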
