Abstract

Markov decision processes (MDPs) have found success in many application areas that involve sequential decision making under uncertainty, including the evaluation and design of treatment and screening protocols for medical decision making. However, the data used to parameterize the model can influence which policies are recommended, and multiple competing data sources are common in many application areas, including medicine. In this article, we introduce the Multi-model Markov decision process (MMDP), which generalizes a standard MDP by allowing for multiple models of the rewards and transition probabilities. Solving the MMDP yields a single policy that maximizes the weighted performance over all models. This approach allows the decision maker to explicitly trade off conflicting sources of data while generating a policy of the same level of complexity as a policy based on a single source of data. We study the structural properties of this problem and show that it is at least NP-hard. We develop exact methods and fast approximation methods supported by error bounds. Finally, we illustrate the effectiveness and the scalability of our approach using a case study in preventative blood pressure and cholesterol management that accounts for conflicting published cardiovascular risk models.
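To make the weighted-performance objective concrete, the following is a minimal illustrative sketch (not the paper's implementation), assuming a finite-horizon MDP with stationary transition probabilities and rewards in each model; the names `policy_value`, `weighted_performance`, and `mu0` are hypothetical. It evaluates a fixed policy in each candidate model by backward induction and combines the resulting values with the model weights.

```python
import numpy as np

def policy_value(P, R, policy, horizon):
    """Expected total reward of a fixed deterministic policy in one MDP model,
    computed by backward induction.
    P[s, a, s']: transition probabilities, R[s, a]: rewards,
    policy[t][s]: action chosen in state s at decision epoch t."""
    n_states, n_actions, _ = P.shape
    V = np.zeros(n_states)
    for t in reversed(range(horizon)):
        V_new = np.empty(n_states)
        for s in range(n_states):
            a = policy[t][s]
            V_new[s] = R[s, a] + P[s, a] @ V
        V = V_new
    return V

def weighted_performance(models, weights, policy, horizon, mu0):
    """MMDP-style objective: the weighted sum, over candidate models, of the
    policy's expected total reward, with initial state distribution mu0."""
    return sum(
        lam * (mu0 @ policy_value(P, R, policy, horizon))
        for (P, R), lam in zip(models, weights)
    )
```

A decision maker could, for example, pass two (P, R) pairs parameterized from conflicting risk models with weights 0.5 each; the single policy that maximizes this weighted objective is what the MMDP formulation seeks.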
