Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making

Qi Zhang, Satinder Singh, Edmund Durfee

doi:10.1609/icaps.v27i1.13836

Abstract

In cooperative multiagent planning, it can often be beneficial for an agent to make commitments about aspects of its behavior to others, allowing them in turn to plan their own behaviors without taking the agent's detailed behavior into account. Extending previous work in the Bayesian setting, we consider instead a worst-case setting in which the agent has a set of possible environments (MDPs) it could be in, and develop a commitment semantics that allows for probabilistic guarantees on the agent's behavior in any of the environments it could end up facing. Crucially, an agent receives observations (of reward and state transitions) that allow it to potentially eliminate possible environments and thus obtain higher utility by adapting its policy to the history of observations. We develop algorithms and provide theory and some preliminary empirical results showing that they ensure an agent meets its commitments with history-dependent policies while minimizing maximum regret over the possible environments.
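To make the minimax-regret objective concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a small, finite set of candidate policies whose values in each possible environment are already known, and it defines regret in an environment as the gap to the best candidate in that environment. It ignores the commitment constraints and history-dependent policies that the paper addresses. The function name `minimax_regret_choice` and the utility numbers are hypothetical.

```python
from typing import Sequence, Tuple


def minimax_regret_choice(utility: Sequence[Sequence[float]]) -> Tuple[int, float]:
    """Pick the policy whose worst-case regret is smallest.

    utility[p][e] is the (assumed known) value of candidate policy p
    in possible environment e. Returns (policy index, its maximum regret).
    """
    num_envs = len(utility[0])
    # Best value achievable by any candidate policy in each environment.
    best_per_env = [max(row[e] for row in utility) for e in range(num_envs)]
    # Worst-case regret of each policy across all environments.
    max_regrets = [
        max(best_per_env[e] - row[e] for e in range(num_envs))
        for row in utility
    ]
    best_policy = min(range(len(utility)), key=lambda p: max_regrets[p])
    return best_policy, max_regrets[best_policy]


# Hypothetical values: 3 candidate policies (rows) x 2 environments (columns).
utility = [
    [10.0, 2.0],  # strong in environment 0, weak in environment 1
    [6.0, 6.0],   # moderate in both
    [3.0, 9.0],   # weak in environment 0, strong in environment 1
]

choice, regret = minimax_regret_choice(utility)
```

Under these assumed numbers the hedging policy (index 1) is selected, with maximum regret 4.0. If observations ruled out environment 1, restricting the matrix to its first column would make policy 0 the choice with zero regret, which illustrates why eliminating environments can raise utility.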

Journal: Proceedings of the International Conference on Automated Planning and Scheduling
Publication Date: Jun 5, 2017
Citations: 3

Similar Papers

Repast Simphony Statecharts
Jonathan Ozik ... Michael North
Journal of Artificial Societies and Social Simulation | VOL. 18 | 01 Jan 2015

Bayesian Mechanism Design with Efficiency, Privacy, and Approximate Truthfulness
Samantha Leung ... Edward Lui
01 Jan 2012

Temporal-Spatial Causal Interpretations for Vision-Based Reinforcement Learning
Wenjie Shi ... Cheng Wu
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 44 | 01 Dec 2022

A dynamic model of multi-agent system
Shan-Li Hu ... Chun-Yi Shi
26 Aug 2004
