Abstract

Maintenance policy optimization typically faces two challenges: imperfect knowledge of the system degradation model and partial observability of the system degradation state. This paper proposes a reinforcement learning method that addresses both challenges for a class of maintenance problems with Markov degradation processes. The approach consists of a learning component and a planning component. At each decision step, the learning component uses sequential Bayesian inference on the observations collected so far to refine the probability distributions of the transition rates of the degradation process. Using the updated transition-rate distributions, the maintenance policy optimization problem is then formulated as a partially observable Markov decision process (POMDP), and the planning component computes the maintenance policy that maximizes the expected cumulative reward. The proposed method is illustrated with a numerical example involving repair and inspection actions. The results show that as more observations are collected, the learning component progressively recovers the true degradation process and the planning component adjusts the maintenance policy accordingly, leading to increased reward.
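
To make the learn-then-plan structure concrete, the following is a minimal Python sketch of one possible instantiation, not the paper's actual algorithm. It assumes sojourn times in a degradation state are exponential with an unknown rate, so a conjugate Gamma prior admits a closed-form sequential Bayesian update; the `plan_policy` function is a hypothetical placeholder standing in for the POMDP planning component.

```python
import numpy as np

class GammaRatePosterior:
    """Gamma(a, b) posterior over an exponential transition rate.

    Conjugate to the exponential likelihood: observing a sojourn time t
    before a transition updates Gamma(a, b) to Gamma(a + 1, b + t).
    """
    def __init__(self, a=1.0, b=1.0):
        self.a, self.b = a, b

    def update(self, sojourn_time):
        # Sequential Bayesian update for one observed transition.
        self.a += 1.0
        self.b += sojourn_time

    def mean(self):
        # Posterior mean of the transition rate.
        return self.a / self.b

def plan_policy(rate_estimate):
    # Hypothetical stand-in for the planning component: in the paper this
    # would solve a POMDP to maximize expected cumulative reward under the
    # current transition-rate beliefs. Here, a simple heuristic inspects
    # more often when the estimated degradation rate is higher.
    return {"inspect_interval": 1.0 / rate_estimate}

rng = np.random.default_rng(0)
true_rate = 0.5                       # unknown to the decision-maker
posterior = GammaRatePosterior()

for step in range(1, 6):
    obs = rng.exponential(1.0 / true_rate)  # observed sojourn time
    posterior.update(obs)                   # learning component
    policy = plan_policy(posterior.mean())  # planning component
    print(f"step {step}: rate estimate {posterior.mean():.3f}, "
          f"inspect every {policy['inspect_interval']:.2f} time units")
```

As more sojourn times arrive, the posterior concentrates around the true rate and the (placeholder) policy adapts, mirroring the convergence behavior reported in the abstract.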
