Abstract

In high-stakes situations, decision-makers are often risk-averse, and decisions are often made in group settings. This paper studies multiagent decision-theoretic planning under the Markov decision process (MDP) framework, taking into account how an agent’s risk attitude changes as their wealth level varies. Based on the one-switch utility function, which describes how an agent’s risk attitude changes with wealth level, we give additive and multiplicative aggregation models of group utility and adopt maximization of expected group utility as the planning objective. We analyze the characteristics of the optimal policy as the wealth level approaches infinity for the additive and multiplicative aggregation models, respectively. A backward-induction method is then proposed that divides the wealth level interval from negative infinity to the initial wealth level into subintervals and determines the optimal policy for each state and subinterval. The proposed method is illustrated by numerical examples, and the influence of the agents’ risk-aversion parameters and weights on group decision-making is also analyzed.

Highlights

  • Decision-theoretic planning computes an optimal policy, formed by courses of action, that maximizes expected reward when actions have uncertain outcomes [1]

  • Taking the agent’s risk-sensitive attitude and wealth level into account, this paper studies the risk-sensitive multiagent decision-theoretic planning problem based on the one-switch utility function and the Markov decision process (MDP) framework

  • For the MDP model augmented with wealth level, the optimal policy maps every combination of a state s ∈ S′ and wealth level w to the action a ∈ A_s that an agent in state s with wealth level w should execute to maximize expected utility
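The last highlight can be illustrated with a minimal Python sketch of a wealth-dependent policy for a single state at the final decision step. Everything concrete here is an assumption made for illustration and is not taken from the paper: the utility form U(w) = w - D * gamma^w (a commonly used linear-plus-exponential one-switch form), the parameter values, and the two hypothetical actions "safe" and "risky".

```python
# Assumed one-switch utility: U(w) = w - D * gamma**w, with D > 0 and 0 < gamma < 1.
# The form and the parameter values are illustrative assumptions.
D, GAMMA = 1.0, 0.5


def one_switch_utility(w, d=D, gamma=GAMMA):
    """Linear term minus an exponential term that dominates at low wealth."""
    return w - d * gamma ** w


# Hypothetical actions available in one state: action -> list of (probability, reward).
ACTIONS = {
    "safe": [(1.0, 0.0)],                # keep the current wealth for certain
    "risky": [(0.5, 3.0), (0.5, -2.0)],  # higher expected reward, but may lose wealth
}


def expected_utility(wealth, outcomes):
    """Expected utility of the final wealth after taking one action."""
    return sum(p * one_switch_utility(wealth + r) for p, r in outcomes)


def best_action(wealth):
    """Action that maximizes expected utility at the given wealth level."""
    return max(ACTIONS, key=lambda a: expected_utility(wealth, ACTIONS[a]))


# Policy for the single state, indexed by wealth level.
policy = {w: best_action(w) for w in range(0, 6)}
print(policy)
```

Under these assumed numbers the chosen action changes once as wealth grows: the agent prefers the safe action at low wealth and the risky action at higher wealth, which is why the policy has to be indexed by wealth as well as by state.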


Summary

Introduction

Decision-theoretic planning computes an optimal policy, formed by courses of action, that maximizes expected reward when actions have uncertain outcomes [1]. Prior work studies Markov decision processes in which the agent’s risk-sensitive attitude is modeled by a one-switch utility function and proposes an exact backward-induction algorithm to compute the optimal policy [8]. Taking the agent’s risk-sensitive attitude and wealth level into account, this paper studies the risk-sensitive multiagent decision-theoretic planning problem based on the one-switch utility function and the MDP framework. Backward-induction algorithms are proposed that compute the optimal policy of risk-sensitive group decision-making under the MDP framework for two kinds of group utility function, the additive and the multiplicative aggregation models.
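As a rough illustration of how a group utility can change the preferred action, the following Python sketch aggregates individual one-switch utilities with a weighted sum. It is a sketch under stated assumptions, not the paper's model: the utility form U_i(w) = w - D_i * gamma_i^w, the agent weights and parameters, and the two hypothetical lotteries are all invented for illustration. Only the additive aggregation is sketched; the multiplicative model is not reproduced because its exact form is not given in this summary.

```python
def one_switch_utility(w, d, gamma):
    """Assumed one-switch form: U(w) = w - d * gamma**w (d > 0, 0 < gamma < 1)."""
    return w - d * gamma ** w


# Hypothetical agents as (weight k_i, D_i, gamma_i); weights sum to 1.
AGENTS = [(0.5, 1.0, 0.5), (0.5, 4.0, 0.5)]

# Hypothetical lotteries over rewards: list of (probability, reward).
SAFE = [(1.0, 0.0)]
RISKY = [(0.5, 3.0), (0.5, -2.0)]


def additive_group_utility(w, agents=AGENTS):
    """Weighted sum of the agents' individual utilities at wealth w."""
    return sum(k * one_switch_utility(w, d, g) for k, d, g in agents)


def expected_group_utility(wealth, outcomes, agents=AGENTS):
    """Expected group utility of the final wealth after one lottery."""
    return sum(p * additive_group_utility(wealth + r, agents) for p, r in outcomes)


def group_prefers_risky(wealth, agents=AGENTS):
    """True if the risky lottery has higher expected group utility than the safe one."""
    risky = expected_group_utility(wealth, RISKY, agents)
    safe = expected_group_utility(wealth, SAFE, agents)
    return risky > safe


print([(w, group_prefers_risky(w)) for w in range(0, 6)])
```

With these assumed parameters, the more risk-averse second agent pushes the group's switch to the risky lottery to a higher wealth level than the first agent would choose alone, a small-scale version of the effect of weights and risk-aversion parameters that the paper analyzes.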

Risk-Sensitive MDP Model Augmented with Wealth Level
Utility Aggregation Model of One-Switch Utility Functions
Preparation for Backward-Induction Method
Division of Wealth Level Interval and Backward-Induction Method
Numerical Examples
Conclusion and Future Works
Proof of Theorem 3
Proof of Theorem 4