Unplanned breakdowns of critical equipment interrupt production throughput in the Industrial IoT (IIoT), and data-driven Predictive Maintenance (PdM) is becoming increasingly important for companies seeking a competitive advantage. Manufacturers, however, still face the onerous task of manually allocating suitably competent personnel when a machine breaks down unexpectedly. Furthermore, human error has a negative ripple effect on both overall equipment downtime and production schedules. In this paper, we formulate this complex resource management problem as a resource optimisation problem to determine whether a model-free Deep Reinforcement Learning (DRL) based PdM framework can automatically learn an optimal decision policy from a stochastic environment. Unlike existing PdM frameworks, our approach considers PdM sensor information together with both physical equipment and human resources as part of the optimisation problem. The proposed DRL-based framework and a Proximal Policy Optimisation Long Short-Term Memory (PPO-LSTM) model are evaluated against baseline results from human participants using a maintenance repair simulator. Empirical results indicate that our PPO-LSTM efficiently learns the optimal decision policy for the resource management problem, outperforming comparable DRL methods and human participants by 53% and 65% respectively. Overall, the simulation results corroborate the proposed DRL-based PdM framework’s superiority in terms of convergence efficiency, simulation performance and flexibility.