Abstract

Condition-based maintenance strategies are effective in enhancing reliability and safety for complex engineering systems that degrade under uncertainty. Such sequential decision-making problems are often modeled as Markov decision processes (MDPs) when the underlying process satisfies the Markov property. Recently, reinforcement learning (RL) has become increasingly effective at solving MDPs with large state spaces. In this paper, we model the condition-based maintenance problem as a discrete-time, continuous-state MDP, without discretizing the deterioration condition of the system. Gaussian process regression is used as a function approximator to model the state transitions and the value functions of states in reinforcement learning. An RL algorithm is then developed to minimize the long-run average cost (instead of the commonly used discounted reward) by iterating on the state-action value function and the state value function, respectively. We verify the capability of the proposed algorithm through simulation experiments and demonstrate its advantages in a case study on a battery maintenance decision-making problem. The proposed algorithm outperforms the discrete-MDP approach by achieving a lower long-run average cost.
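
The central ingredient, using a Gaussian process as the function approximator for the value of a continuous degradation state, can be sketched roughly as follows. This is a hypothetical illustration only: the gamma-increment degradation model, the cost parameters, and the discounted Bellman backup are assumptions made for the sketch (the paper's algorithm instead optimizes the long-run average cost), and scikit-learn's GaussianProcessRegressor stands in for whichever GP implementation the authors use.

```python
# Hypothetical sketch: GP value-function approximation for a continuous-state
# maintenance MDP. Model, costs, and the discounted backup are illustrative
# assumptions, not the paper's algorithm.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
REPLACE_COST, FAILURE_COST, GAMMA = 5.0, 50.0, 0.95  # assumed parameters

def step(s, a):
    """Toy model: action 1 replaces (reset to 0), action 0 lets it degrade."""
    if a == 1:
        return 0.0, REPLACE_COST
    s_next = min(s + rng.gamma(2.0, 0.05), 1.0)       # stochastic wear increment
    return s_next, (FAILURE_COST if s_next >= 1.0 else 0.0)

# Fitted value iteration with a GP as the value-function approximator over
# sampled degradation states in [0, 1] (0 = as-new, 1 = failed).
states = rng.uniform(0.0, 1.0, 200).reshape(-1, 1)
gp = GaussianProcessRegressor(kernel=RBF(0.2) + WhiteKernel(1e-2),
                              normalize_y=True)

for sweep in range(20):
    targets = []
    for s in states.ravel():
        q = []
        for a in (0, 1):
            s_next, cost = step(s, a)
            v_next = gp.predict(np.array([[s_next]]))[0] if sweep > 0 else 0.0
            q.append(cost + GAMMA * v_next)           # discounted Bellman backup
        targets.append(min(q))                        # costs, so take the minimum
    gp.fit(states, np.array(targets))                 # refit the GP value function

def act(s):
    """Greedy maintenance decision at a queried condition level."""
    q = [cost + GAMMA * gp.predict(np.array([[s_next]]))[0]
         for s_next, cost in (step(s, a) for a in (0, 1))]
    return int(np.argmin(q))

print(act(0.8))   # typically 1 (replace) for a heavily degraded state
```

The learned greedy rule tends to recommend replacement once the degradation state is high, which mirrors the kind of condition-based policy the paper derives, here obtained without ever discretizing the state.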
