Exploratory Combinatorial Optimization with Reinforcement Learning

Thomas Barrett, Jakob Foerster, Alex Lvovsky, William Clements

doi:10.1609/aaai.v34i04.5723

Abstract

Many real-world problems can be reduced to combinatorial optimization on a graph, where the subset or ordering of vertices that maximizes some objective function must be found. Because such tasks are often NP-hard and analytically intractable, reinforcement learning (RL) has shown promise as a framework for learning efficient heuristic methods to tackle them. Previous works construct the solution subset incrementally, adding one element at a time; however, the irreversible nature of this approach prevents the agent from revising its earlier decisions, which may be necessary given the complexity of the optimization task. We instead propose that the agent should seek to continuously improve the solution by learning to explore at test time. Our approach of exploratory combinatorial optimization (ECO-DQN) is, in principle, applicable to any combinatorial problem that can be defined on a graph. Experimentally, we show that our method produces state-of-the-art RL performance on the Maximum Cut problem. Moreover, because ECO-DQN can start from an arbitrary configuration, it can be combined with other search methods to further improve performance, which we demonstrate using a simple random search.
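
To make the setting concrete, the following is a minimal Python sketch of the Maximum Cut objective and of a reversible search that may flip any vertex at any step, started from random configurations. It is an illustration under stated assumptions, not the authors' implementation: the greedy flip rule is only a stand-in for the learned ECO-DQN policy, and the names `cut_value`, `flip_gain`, `greedy_flip_search`, and `random_restart_search` are hypothetical helpers introduced here.

```python
import random


def cut_value(edges, side):
    """Count edges whose endpoints lie on opposite sides of the partition."""
    return sum(1 for u, v in edges if side[u] != side[v])


def flip_gain(adj, side, v):
    """Change in cut value if vertex v switches sides."""
    same = sum(1 for u in adj[v] if side[u] == side[v])
    # Edges to same-side neighbours become cut; the others stop being cut.
    return same - (len(adj[v]) - same)


def greedy_flip_search(n, edges, start, max_steps=100):
    """Repeatedly flip the vertex with the largest positive gain.

    Assumption: this greedy rule replaces the learned policy. Any vertex,
    including one flipped earlier, may be flipped again, so earlier
    decisions are reversible.
    """
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    side = list(start)
    for _ in range(max_steps):
        gains = [flip_gain(adj, side, v) for v in range(n)]
        best = max(range(n), key=gains.__getitem__)
        if gains[best] <= 0:
            break  # local optimum: no single flip improves the cut
        side[best] = 1 - side[best]
    return side, cut_value(edges, side)


def random_restart_search(n, edges, restarts=20, seed=0):
    """Run the flip search from several random starts and keep the best cut."""
    rng = random.Random(seed)
    best_side, best_cut = None, -1
    for _ in range(restarts):
        start = [rng.randint(0, 1) for _ in range(n)]
        side, cut = greedy_flip_search(n, edges, start)
        if cut > best_cut:
            best_side, best_cut = side, cut
    return best_side, best_cut
```

For example, on the 4-cycle with edges `[(0, 1), (1, 2), (2, 3), (3, 0)]`, calling `greedy_flip_search(4, edges, [0, 0, 0, 0])` flips two opposite vertices and cuts all four edges. A purely greedy rule stops at the first local optimum, which is the limitation that learned exploration is meant to address.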

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
Publication Date: Apr 3, 2020
Citations: 77
