Policy-Guided Heuristic Search with Guarantees

Laurent Orseau, Levi H S Lelis

doi:10.1609/aaai.v35i14.17469

Abstract

The use of a policy and a heuristic function for guiding search can be quite effective in adversarial problems, as demonstrated by AlphaGo and its successors, which are based on the PUCT search algorithm. While PUCT can also be used to solve single-agent deterministic problems, it lacks guarantees on its search effort and can be computationally inefficient in practice. Combining the A* algorithm with a learned heuristic function tends to work better in these domains, but A* and its variants do not use a policy. Moreover, the purpose of using A* is to find solutions of minimum cost, while we seek instead to minimize the search loss (e.g., the number of search steps). LevinTS is guided by a policy and provides guarantees on the number of search steps that relate to the quality of the policy, but it does not make use of a heuristic function. In this work we introduce Policy-guided Heuristic Search (PHS), a novel search algorithm that uses both a heuristic function and a policy and has theoretical guarantees on the search loss that relate to the quality of both the heuristic and the policy. We show empirically on the sliding-tile puzzle, Sokoban, and a puzzle from the commercial game 'The Witness' that PHS enables the rapid learning of both a policy and a heuristic function and compares favorably with A*, Weighted A*, Greedy Best-First Search, LevinTS, and PUCT in terms of the number of problems solved and search time in all three domains tested.
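
To make the idea of guiding a best-first search with both a policy and a heuristic concrete, here is a minimal sketch. The abstract does not state the evaluation function, so the priority used below, (g(n) + h(n)) / π(n), is an assumption for illustration: g(n) is the path cost so far, h(n) the heuristic estimate, and π(n) the product of the policy's action probabilities along the path. The function name, the closed-set duplicate handling, and the toy number-line problem are all hypothetical and are not the authors' implementation.

```python
import heapq
from itertools import count


def policy_guided_best_first(start, is_goal, successors, heuristic):
    """Best-first search ordered by (g + h) / pi (assumed priority).

    successors(state) yields (child, step_cost, action_prob) triples.
    Returns (path, expansions), or (None, expansions) if no goal is found.
    """
    tie = count()  # tie-breaker so heap entries never compare states
    # Each entry: (priority, tiebreak, g, pi, state, path)
    frontier = [(heuristic(start), next(tie), 0.0, 1.0, start, [start])]
    seen = set()  # simple closed set: a simplification for this sketch
    expansions = 0
    while frontier:
        _, _, g, pi, state, path = heapq.heappop(frontier)
        if state in seen:
            continue
        seen.add(state)
        if is_goal(state):
            return path, expansions
        expansions += 1
        for child, cost, prob in successors(state):
            if child in seen or prob <= 0.0:
                continue
            g2 = g + cost
            pi2 = pi * prob  # probability of the whole path under the policy
            priority = (g2 + heuristic(child)) / pi2
            heapq.heappush(
                frontier, (priority, next(tie), g2, pi2, child, path + [child])
            )
    return None, expansions


# Toy problem (hypothetical): walk on the integers from 0 to 5.
# The "policy" prefers +1 with probability 0.8 over -1 with probability 0.2.
GOAL = 5


def toy_successors(s):
    return [(s + 1, 1.0, 0.8), (s - 1, 1.0, 0.2)]


path, expansions = policy_guided_best_first(
    0, lambda s: s == GOAL, toy_successors, lambda s: abs(GOAL - s)
)
```

In this toy run the policy and the heuristic agree, so nodes off the direct route receive a much larger priority value and are not expanded before the goal is reached. With a uniform policy the ordering reduces to g + h, and with h = 0 and unit costs it reduces to depth divided by path probability.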

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: May 18, 2021
Citations: 1
