Stochastic enforced hill-climbing

Jianhua Wu, Rajesh Kalyanam, Robert Givan

doi:10.1613/jair.3420

Abstract

Enforced hill-climbing is an effective deterministic hill-climbing technique that deals with local optima using breadth-first search (a process called flooding). We propose and evaluate a stochastic generalization of enforced hill-climbing for online use in goal-oriented probabilistic planning problems. We assume a provided heuristic function estimating expected cost to the goal with flaws such as local optima and plateaus that thwart straightforward greedy action choice. While breadth-first search is effective in exploring basins around local optima in deterministic problems, for stochastic problems we dynamically build and solve a heuristic-based Markov decision process (MDP) model of the basin in order to find a good escape policy exiting the local optimum. We note that building this model involves integrating the heuristic into the MDP problem because the local goal is to improve the heuristic. We evaluate our proposal in twenty-four recent probabilistic planning-competition benchmark domains and twelve probabilistically interesting problems from recent literature. For evaluation, we show that stochastic enforced hill-climbing (SEH) produces better policies than greedy heuristic following for value/cost functions derived in two very different ways: one type derived by using deterministic heuristics on a deterministic relaxation and a second type derived by automatic learning of Bellman-error features from domain-specific experience. Using the first type of heuristic, SEH is shown to generally outperform all planners from the first three international probabilistic planning competitions.
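The basin-model idea described in the abstract can be pictured with the minimal Python sketch below. It is not the authors' implementation. The depth bound `max_depth`, the unit step cost, the choice to use the heuristic as the terminal value at exits and at the expansion frontier, and all function, state, and action names are assumptions made for illustration. The sketch expands states breadth-first from the current state, treats any state with a strictly lower heuristic value as an exit, and runs value iteration on the resulting local MDP to obtain an escape policy.

```python
from collections import deque

def build_basin(s0, actions, h, max_depth):
    """Breadth-first expansion from s0, stopping at exits and at the depth bound."""
    h0 = h(s0)
    interior, terminal = set(), set()
    seen, queue = {s0}, deque([(s0, 0)])
    while queue:
        s, depth = queue.popleft()
        # Terminal if the heuristic improved (exit), the depth bound is hit,
        # or no action is applicable.
        if h(s) < h0 or depth == max_depth or not actions(s):
            terminal.add(s)
            continue
        interior.add(s)
        for outcomes in actions(s).values():
            for _prob, nxt in outcomes:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return interior, terminal

def q_value(s, a, actions, V, step_cost):
    """Expected cost of taking action a in s, then following values V."""
    return step_cost + sum(p * V[nxt] for p, nxt in actions(s)[a])

def solve_escape(s0, actions, h, max_depth, step_cost=1.0, sweeps=200):
    """Value iteration on the local MDP; terminal states keep the value h(s)."""
    interior, terminal = build_basin(s0, actions, h, max_depth)
    V = {s: h(s) for s in interior | terminal}
    for _ in range(sweeps):
        for s in interior:
            V[s] = min(q_value(s, a, actions, V, step_cost) for a in actions(s))
    policy = {
        s: min(actions(s), key=lambda a: q_value(s, a, actions, V, step_cost))
        for s in interior
    }
    return V, policy

# Hypothetical toy problem: s0 and p lie on a plateau (h = 5), e is an exit (h = 2).
# Each action maps to a list of (probability, next_state) outcomes.
TRANSITIONS = {
    "s0": {"stay": [(1.0, "s0")], "risky": [(1.0, "p")]},
    "p": {"push": [(0.5, "e"), (0.5, "s0")]},
}
H = {"s0": 5.0, "p": 5.0, "e": 2.0}

V, policy = solve_escape("s0", lambda s: TRANSITIONS.get(s, {}), H.get, max_depth=2)
print(policy["s0"], policy["p"])  # prints "risky push"
```

In this toy problem no single action from `s0` lowers the heuristic, so one-step greedy choice has nothing to prefer, while the local MDP solution commits to the two-step route toward the exit. An online planner built along these lines would presumably follow the returned policy until an exit state is reached and then repeat from there; a fixed sweep count stands in for a proper convergence test.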


Journal: Journal of Artificial Intelligence Research
Publication Date: Sep 1, 2011
Citations: 26

Similar Papers

Fluctuation Reduction of Wind Power and Sizing of Battery Energy Storage Systems in Microgrids
Zhen Yang ... Li Xia
IEEE Transactions on Automation Science and Engineering | VOL. 17
01 Jan 2020

Conversion of MDP problems into heuristics based planning problems using temporal decomposition
Rida Gillani ... Ali Nasir
01 Jan 2015

Dynamic target tracking based on corner enhancement with Markov decision process
Guoyu Zuo ... Lei Ma
The Journal of Engineering | VOL. 2018
17 Sep 2018

Distributed Service Migration in Satellite Mobile Edge Computing
Zhen Li ... Chunxiao Jiang
01 Dec 2021
