Abstract
This paper addresses the problem of approximating the set of all solutions for Multi-objective Markov Decision Processes. We show that in the vast majority of interesting cases, the number of solutions is exponential or even infinite. To overcome this difficulty, we propose to approximate the set of all solutions by means of a limited-precision approach based on White's multi-objective value-iteration dynamic programming algorithm. We prove that the number of calculated solutions is tractable, and we show experimentally that the solutions obtained are a good approximation of the true Pareto front.
Highlights
Markov decision processes (MDPs) are a well-known conceptual tool useful for modelling the operation of systems as sequential decision processes
This paper analyzes some practical difficulties that arise in the solution of Multi-objective Markov decision processes (MOMDPs)
We show that the number of nondominated policy values is tractable only under a number of limiting assumptions
Summary
Markov decision processes (MDPs) are a well-known conceptual tool for modelling the operation of systems as sequential decision processes. When the decision maker's preferences can be stated explicitly, prior to problem solving, as a scalar function to be optimized, we are led to single-policy approaches (Perny and Weng [9], Wray et al. [19]). With more general preference models where mixture policies are not acceptable (e.g. for ethical reasons, see Lizotte et al. [7]), Pareto-optimal non-stationary policies need to be taken into consideration. This case was theoretically solved by White [18], but the exact solution is not feasible in practice due to the intractable (or even infinite) size of the Pareto front. The solution to an MOMDP is given by the sets V(s) of nondominated value vectors over all states s.
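The idea behind White-style multi-objective value iteration and the limited-precision approximation can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the function names (`mo_value_iteration`, `prune`, `snap`), the deterministic toy MDP, and the ε-grid rounding scheme are all assumptions made for the example. Each state's value V(s) is a *set* of value vectors, one per nondominated policy; Pareto pruning discards dominated vectors, and rounding each component to an ε-grid keeps the stored front finite.

```python
def dominates(u, v):
    """True if value vector u Pareto-dominates v (>= everywhere, > somewhere)."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def prune(vectors):
    """Keep only the nondominated vectors (the Pareto front of the set)."""
    return {u for u in vectors
            if not any(dominates(v, u) for v in vectors if v != u)}

def snap(u, eps):
    """Limited-precision step: round each component onto a grid of width eps."""
    return tuple(round(x / eps) * eps for x in u)

def mo_value_iteration(states, actions, P, R, gamma=0.9, eps=0.1, iters=30):
    """Multi-objective value iteration: V[s] holds the set of nondominated
    value vectors achievable from state s, on an eps-precision grid."""
    n_obj = len(next(iter(R.values())))
    V = {s: {(0.0,) * n_obj} for s in states}
    for _ in range(iters):
        newV = {}
        for s in states:
            candidates = set()
            for a in actions:
                s2 = P[(s, a)]  # deterministic transitions, for brevity
                for w in V[s2]:
                    v = tuple(r + gamma * wi for r, wi in zip(R[(s, a)], w))
                    candidates.add(snap(v, eps))
            newV[s] = prune(candidates)
        V = newV
    return V

# Hypothetical 2-state, 2-objective MDP: action 'a' earns reward on
# objective 0, action 'b' on objective 1, so the two objectives conflict.
P = {(0, 'a'): 0, (0, 'b'): 1, (1, 'a'): 1, (1, 'b'): 0}
R = {(0, 'a'): (1.0, 0.0), (0, 'b'): (0.0, 1.0),
     (1, 'a'): (0.0, 1.0), (1, 'b'): (1.0, 0.0)}
V = mo_value_iteration([0, 1], ['a', 'b'], P, R)
```

The rounding is what makes the approach tractable: each discounted component is bounded by r_max/(1−γ), so on an ε-grid each V(s) can hold at most a polynomial number of distinct nondominated vectors, instead of the exponential or infinite exact front.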