In this paper, we develop a partially observable Markov decision process (POMDP) model for a maintenance planning problem and solve it with an efficient point-based value iteration (PBVI) algorithm. We consider a single-unit system, subject to random degradation and failures, whose current degradation state can be partially observed via an imperfect monitoring system. The system state space is finite, and we model three maintenance operations: i) perfect inspection, ii) preventive maintenance, and iii) corrective maintenance. The goal is to optimize the maintenance policy, taking the imperfect monitoring data into account, so as to minimize the expected discounted maintenance cost over an infinite time horizon. We formulate the problem as a POMDP in which, at each time step, a decision must be made on whether to conduct a maintenance operation and, if so, which one. To keep the model general and flexible, we suppose that monitoring data are collected every K time steps (i.e., one observation epoch). The model is completed by a constraint imposing that only one maintenance operation can be conducted per observation epoch. Finally, we solve the model using a PBVI algorithm: the value function is approximated by interpolation over a grid of data points, and new relevant points are dynamically added to the grid where they most improve the value function. This approach is compared to a POMDP model based on approximate sample paths (ASP); when evaluated in different cost scenarios, the proposed approach systematically finds better maintenance policies within a comparable computation time. The computation of a lower bound confirms that we obtain the optimal value of the problem to a satisfactory precision.
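To illustrate the partial-observability mechanism the abstract describes, the sketch below shows a standard POMDP belief update for a finite-state degradation model with an imperfect monitoring system. The three-state model, the transition matrix `P`, and the observation matrix `O` are purely illustrative assumptions, not the paper's parameters.

```python
import numpy as np

# Hypothetical 3-state degradation model: 0 = good, 1 = degraded, 2 = failed.
# All matrices below are illustrative assumptions, not taken from the paper.

# Transition matrix P[s, s']: degradation only moves toward failure.
P = np.array([
    [0.90, 0.08, 0.02],
    [0.00, 0.85, 0.15],
    [0.00, 0.00, 1.00],
])

# Observation matrix O[s, o]: imperfect monitoring, so the observation o
# only noisily reflects the true state s.
O = np.array([
    [0.80, 0.15, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
])

def belief_update(b, obs):
    """One Bayes filter step: predict through P, then correct with obs."""
    predicted = b @ P                      # prior after one degradation step
    unnormalized = predicted * O[:, obs]   # weight by observation likelihood
    return unnormalized / unnormalized.sum()

b0 = np.array([1.0, 0.0, 0.0])   # system known to start in the good state
b1 = belief_update(b0, obs=1)    # a "degraded-looking" reading shifts belief mass
```

A PBVI solver such as the one in the paper operates on belief points like `b1`: the value function is represented only at a finite grid of such points and interpolated in between, which is what makes the infinite belief simplex computationally tractable.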