An anytime algorithm for constrained stochastic shortest path problems with deterministic policies

Sungkweon Hong, Brian C Williams

doi:10.1016/j.artint.2022.103846

Abstract

Sequential decision-making problems arise throughout daily life and pose unique challenges for research in decision-theoretic planning. Although the field has been studied extensively, most work has focused on single-objective problems without constraints. In many real-world applications, however, it is desirable to keep certain costs or resource usage below a predefined level. The constrained stochastic shortest path problem (C-SSP), one of the best-known mathematical frameworks for stochastic decision-making under constraints, models such problems formally by incorporating the constraints into the problem formulation. However, because of the intrinsic complexity of C-SSPs, producing an optimal deterministic policy within a practical computation time remains an open challenge. In this paper, we propose a method that produces an optimal deterministic policy for a C-SSP, based on Lagrangian duality theory and heuristic forward search. To address the intrinsic complexity of C-SSPs, the proposed method is designed to have an anytime property: it quickly finds a feasible, reasonably good solution, then improves it incrementally until it converges to a true optimal solution. An extensive experimental evaluation on three problem domains shows that the proposed method outperforms state-of-the-art methods at producing near-optimal solutions, with an optimality gap of less than 0.1%.
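
As a rough illustration of the Lagrangian-duality idea with an anytime incumbent, the sketch below works on a small, hypothetical set of deterministic policies that are summarized only by their expected primary and secondary costs. It is not the authors' algorithm: the heuristic forward search is replaced by a plain minimum over an explicit list, and the names `PolicyCost`, `best_response`, and `anytime_dual_search`, as well as the cost values, are assumptions made for this example. For each multiplier `lam`, the policy minimizing `primary + lam * secondary` is selected; bisection on `lam` then tightens a lower bound on the optimal cost while the best feasible policy found so far is kept as the incumbent.

```python
from typing import List, NamedTuple, Optional, Tuple


class PolicyCost(NamedTuple):
    """Hypothetical summary of one deterministic policy."""
    name: str
    primary: float    # expected primary (objective) cost
    secondary: float  # expected cost subject to the bound


def best_response(policies: List[PolicyCost], lam: float) -> PolicyCost:
    # Policy minimizing the Lagrangian-weighted cost for a fixed multiplier.
    return min(policies, key=lambda p: p.primary + lam * p.secondary)


def anytime_dual_search(
    policies: List[PolicyCost],
    bound: float,
    lam_hi: float = 100.0,
    iters: int = 30,
) -> Tuple[Optional[PolicyCost], float]:
    """Bisection on the multiplier; returns (incumbent, lower bound)."""
    lam_lo = 0.0
    incumbent: Optional[PolicyCost] = None
    lower_bound = float("-inf")
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        p = best_response(policies, lam)
        # The dual value at lam is a lower bound on the constrained optimum.
        lower_bound = max(lower_bound, p.primary + lam * (p.secondary - bound))
        if p.secondary <= bound:
            # Feasible: keep it if it improves the incumbent, then relax lam.
            if incumbent is None or p.primary < incumbent.primary:
                incumbent = p
            lam_hi = lam
        else:
            # Infeasible: penalize the secondary cost more heavily.
            lam_lo = lam
    return incumbent, lower_bound


# Illustrative data only; these numbers do not come from the paper.
example_policies = [
    PolicyCost("A", primary=10.0, secondary=8.0),
    PolicyCost("B", primary=12.0, secondary=5.0),
    PolicyCost("C", primary=15.0, secondary=3.0),
]
incumbent, lower_bound = anytime_dual_search(example_policies, bound=5.0)
gap = incumbent.primary - lower_bound
print(incumbent.name, lower_bound, gap)
```

Stopping the loop early still leaves a usable incumbent and a bound on how far it can be from optimal, which is the sense of "anytime" used here. With deterministic policies the lower bound and the best feasible cost need not meet in general; this sketch makes no attempt to close such a gap.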

Journal: Artificial Intelligence
Publication Date: Dec 30, 2022
Citations: 1

Similar Papers

Deterministic policies based on maximum regrets in MDPs with imprecise rewards
Pegah Alizadeh ... Aomar Osmani
AI Communications | VOL. 34
18 Mar 2022

Delay-Optimal Dynamic Mode Selection and Resource Allocation in Device-to-Device Communications—Part I: Optimal Policy
Lei Lei ... Chuang Lin
IEEE Transactions on Vehicular Technology | VOL. 65
01 May 2016

Fair Resource Allocation with QoS Guarantee in Secure Multiuser TDMA Networks
Zhiquan Bai ... Piming Ma
Wireless Communications and Mobile Computing | VOL. 2018
01 Jan 2018

Decentralized Deterministic Multi-Agent Reinforcement Learning
Antoine Grosnit ... Laura Wynter
14 Dec 2021
