Optimizing Expectation with Guarantees in POMDPs

Krishnendu Chatterjee, Guillermo Pérez, Jean-François Raskin, Petr Novotný, Đorđe Žikelić

doi:10.1609/aaai.v31i1.11046

Abstract

A standard objective in partially-observable Markov decision processes (POMDPs) is to find a policy that maximizes the expected discounted-sum payoff. However, such policies may still permit unlikely but highly undesirable outcomes, which is especially problematic in safety-critical applications. Recently, there has been a surge of interest in POMDPs where the goal is to maximize the probability that the payoff is at least a given threshold, but these approaches do not consider any optimization beyond satisfying this threshold constraint. In this work we go beyond both the “expectation” and “threshold” approaches and consider a “guaranteed payoff optimization (GPO)” problem for POMDPs, where we are given a threshold t and the objective is to find a policy σ such that a) each possible outcome of σ yields a discounted-sum payoff of at least t, and b) the expected discounted-sum payoff of σ is optimal (or near-optimal) among all policies satisfying a). We present a practical approach to tackle the GPO problem and evaluate it on standard POMDP benchmarks.
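To make conditions a) and b) concrete, here is a minimal brute-force sketch in Python. It is not the approach presented in the paper: it assumes a hypothetical, already-enumerated finite set of candidate policies, each summarized by the list of its possible outcomes as (probability, discounted-sum payoff) pairs. The names `solve_gpo`, `expected_payoff`, and the example policies and numbers are illustrative assumptions only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: one (probability, discounted-sum payoff)
# pair per possible outcome of a policy.
Outcome = Tuple[float, float]


def expected_payoff(outcomes: List[Outcome]) -> float:
    """Expected discounted-sum payoff over the listed outcomes."""
    return sum(prob * payoff for prob, payoff in outcomes)


def solve_gpo(policies: Dict[str, List[Outcome]], t: float) -> Optional[str]:
    """Pick the policy with the best expectation among those whose every
    possible outcome (probability > 0) has payoff at least t.
    Returns None if no candidate meets the guarantee."""
    # Condition a): keep only policies whose worst possible outcome is >= t.
    safe = {
        name: outcomes
        for name, outcomes in policies.items()
        if all(payoff >= t for prob, payoff in outcomes if prob > 0)
    }
    if not safe:
        return None
    # Condition b): maximize the expectation among the remaining policies.
    return max(safe, key=lambda name: expected_payoff(safe[name]))


# Illustrative (made-up) candidates.
policies = {
    "risky": [(0.9, 10.0), (0.1, -5.0)],   # best expectation, bad worst case
    "cautious": [(1.0, 4.0)],              # single guaranteed outcome
    "hedged": [(0.5, 6.0), (0.5, 3.0)],    # meets t = 2 with higher expectation
}

best = solve_gpo(policies, t=2.0)
```

With the threshold t = 2, the "risky" candidate is excluded despite having the highest expectation, because one of its outcomes falls below t; the choice is then made purely on expectation among the remaining candidates. Enumerating policies like this is infeasible for real POMDPs and serves only to illustrate the objective.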

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Feb 12, 2017
Citations: 8

Similar Papers

Model-Based Performance Evaluation of Safety-Critical POMDPs
Parisa Pouya ... Azad M Madni
12 Dec 2021

A Bayesian game based adaptive fuzzy controller for multiagent POMDPs
Rajneesh Sharma ... Matthijs T J Spaan
01 Jul 2010

Tractable POMDP-planning for robots with complex non-linear dynamics
Marcus Hoerger
16 Mar 2020

Partially observed Markov decision processes (POMDPs)
Vikram Krishnamurthy
01 Jan 2015