Finite-Horizon and Infinite-Horizon Markov Decision Processes with Trapezoidal Fuzzy Discounted Rewards

Karla Carrero-Vera, Raúl Montes-De-Oca, Hugo Cruz-Suárez

doi:10.1007/978-3-031-10725-2_9

Abstract

This article presents discrete-time discounted Markov decision processes (MDPs; singular, MDP) with finite state spaces, compact action sets, and trapezoidal fuzzy reward functions. For MDPs of this kind, both the finite-horizon and the infinite-horizon cases are studied. The corresponding optimal control problems are formulated with respect to a partial order on the \(\alpha\)-cuts of fuzzy numbers, called the fuzzy max order. The fuzzy optimal solution is related to a suitable discounted MDP with a nonfuzzy reward. Several applications of the theory are also provided: a finite-horizon model of an inventory system, for which an algorithm to calculate the optimal solution is given, and, for the infinite-horizon case, an MDP and a competitive MDP (also known as a stochastic game) set in an economic and financial context.

Keywords: Discounted Markov decision process; Optimal policy; Fuzzy set; Trapezoidal fuzzy number; Fuzzy reward

Full Text
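
As a reading aid for the fuzzy max order named in the abstract, the following is a minimal sketch and not the authors' code. It assumes a trapezoidal fuzzy number is written as a quadruple (a, b, c, d) with a ≤ b ≤ c ≤ d, whose \(\alpha\)-cut is the interval [a + α(b − a), d − α(d − c)]. The class name `Trapezoid` and the function `fuzzy_max_leq` are hypothetical, and the article's own notation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy number (a, b, c, d), assumed to satisfy a <= b <= c <= d."""
    a: float
    b: float
    c: float
    d: float

    def alpha_cut(self, alpha: float) -> tuple[float, float]:
        """Return the closed interval [lower, upper] of the alpha-cut, 0 <= alpha <= 1."""
        lower = self.a + alpha * (self.b - self.a)
        upper = self.d - alpha * (self.d - self.c)
        return (lower, upper)


def fuzzy_max_leq(x: Trapezoid, y: Trapezoid) -> bool:
    """Fuzzy max order: x <= y iff both alpha-cut endpoints of x are <= those of y
    for every alpha in [0, 1]. The endpoints are linear in alpha, so checking
    alpha = 0 and alpha = 1 (the four defining parameters) is enough."""
    return x.a <= y.a and x.b <= y.b and x.c <= y.c and x.d <= y.d


if __name__ == "__main__":
    r1 = Trapezoid(1, 2, 3, 4)
    r2 = Trapezoid(2, 3, 4, 5)
    r3 = Trapezoid(1, 2, 3, 6)
    print(r1.alpha_cut(0.5))
    print(fuzzy_max_leq(r1, r2))
    # r2 and r3 are incomparable: the order is only partial.
    print(fuzzy_max_leq(r2, r3), fuzzy_max_leq(r3, r2))
```

The last two comparisons illustrate why the order is only partial: neither of `r2` and `r3` dominates the other, so an optimal policy under this order has to be understood componentwise across the \(\alpha\)-cut endpoints.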