Fast-Tracking Stationary MOMDPs for Adaptive Management Problems

Martin Péron, Kai Becker, Iadine Chadès, Peter Bartlett

doi:10.1609/aaai.v31i1.11173

Abstract

Adaptive management is applied in conservation and natural resource management, and consists of making sequential decisions when the transition matrix is uncertain. Informally described as 'learning by doing', this approach aims to trade off between decisions that help achieve the objective and decisions that will yield better knowledge of the true transition matrix. When the true transition matrix is assumed to be an element of a finite set of possible matrices, solving a mixed observability Markov decision process (MOMDP) leads to an optimal trade-off but is very computationally demanding. Under the assumption (common in adaptive management) that the true transition matrix is stationary, we propose a polynomial-time algorithm to find a lower bound on the value function. In the corners of the domain of the value function (the belief space), this lower bound is provably equal to the optimal value function. We also show that, under further assumptions, it is a linear approximation of the optimal value function in a neighborhood around the corners. We evaluate the benefits of our approach by using it to initialize the solvers MO-SARSOP and Perseus on a novel computational sustainability problem and a recent adaptive management data challenge. Our approach leads to an improved initial value function and translates into significant computational gains for both solvers.
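The corner property can be illustrated with a small sketch. The construction below is one natural reading of the abstract, not necessarily the authors' algorithm: solve each candidate model as an ordinary MDP, evaluate each resulting policy under every candidate model, and take the best belief-weighted value. Because the hidden model is stationary, any fixed state-based policy has a value that is linear in the belief, so the maximum over such policies is a valid lower bound. All names (`solve_mdp`, `evaluate_policy`, `corner_alpha_vectors`, `lower_bound`), the toy models, and the assumption of a known reward table shared by all models are hypothetical.

```python
def q_values(P, R, V, gamma):
    """Q[s][a] = R[s][a] + gamma * sum_t P[a][s][t] * V[t]."""
    n_actions, n_states = len(P), len(V)
    return [[R[s][a] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
             for a in range(n_actions)] for s in range(n_states)]

def solve_mdp(P, R, gamma=0.95, tol=1e-10):
    """Value iteration for one candidate model; returns (values, greedy policy)."""
    V = [0.0] * len(R)
    while True:
        Q = q_values(P, R, V, gamma)
        V_new = [max(row) for row in Q]
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            return V_new, [row.index(max(row)) for row in Q]
        V = V_new

def evaluate_policy(P, R, policy, gamma=0.95, tol=1e-10):
    """Iterative evaluation of a fixed state-based policy under model P."""
    V = [0.0] * len(R)
    while True:
        Q = q_values(P, R, V, gamma)
        V_new = [Q[s][policy[s]] for s in range(len(R))]
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            return V_new
        V = V_new

def corner_alpha_vectors(models, R, gamma=0.95):
    """alpha[i][j][s]: value from state s, under model j, of the policy optimal for model i."""
    policies = [solve_mdp(P, R, gamma)[1] for P in models]
    return [[evaluate_policy(P, R, pi, gamma) for P in models] for pi in policies]

def lower_bound(alpha, belief, state):
    """Best belief-weighted value among the per-model optimal policies."""
    return max(sum(b * alpha_i[j][state] for j, b in enumerate(belief))
               for alpha_i in alpha)

# Hypothetical toy problem: 2 states, 2 actions, 2 candidate transition models.
# Each model is indexed as P[action][state][next_state].
MODELS = [
    [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.5, 0.5]]],
    [[[0.5, 0.5], [0.5, 0.5]], [[0.1, 0.9], [0.8, 0.2]]],
]
R = [[0.0, 0.0], [1.0, 0.8]]  # R[state][action], assumed shared by both models

ALPHA = corner_alpha_vectors(MODELS, R)
```

Calling `lower_bound(ALPHA, [1.0, 0.0], state)` evaluates the bound at the corner where the first model is known to be true; there it coincides, up to the iteration tolerance, with the optimal value of that model's MDP. At an interior belief such as `[0.5, 0.5]` the same function only guarantees a lower bound, which is the kind of quantity the abstract describes using to initialize solvers.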

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Feb 12, 2017
Citations: 5

