Post-prognostics demand management, production, spare parts and maintenance planning for a single-machine system using Reinforcement Learning

Kevin Wesendrup, Bernd Hellingrath

doi:10.1016/j.cie.2023.109216

Abstract

Production Planning and Control (PPC) is crucial for any manufacturer and comprises steps such as demand management, production, and source planning. Manufacturers achieve competitive advantage by sustaining continuous production, which can be realised through Condition-based Maintenance and Prognostics and Health Management. With these approaches, the machine’s health can be predicted, and post-prognostics decision-making makes it possible to optimise PPC to meet customer demands and minimise costs. Unfortunately, the complex, dynamic, stochastic and opaque nature of post-prognostics PPC makes it intractable to use ‘traditional’ static or deterministic optimisation techniques, or approaches that require an exact mathematical model or objective function. To tackle this, a data-driven post-prognostics Reinforcement Learning model is developed to plan and control the sourcing of spare parts, production, and maintenance of a single-machine production system, with the aim of maximising production revenue by meeting customer demands and minimising costs. In a case study, Proximal Policy Optimisation, which is well-known from OpenAI’s ChatGPT, is applied to a post-prognostics PPC decision-making problem. Proximal Policy Optimisation is compared to other state-of-the-art learners, and its performance and robustness are evaluated. Analyses show that our model outperforms the other learners, as well as reactive and scheduled preventive maintenance strategies, and is robust to noise and cost changes.

Full Text