Abstract

Various methods have been proposed in the literature for identifying subgoals in discrete reinforcement learning (RL) tasks. Once subgoals are discovered, task decomposition methods can be employed to improve the learning performance of agents. In this study, we classify prominent subgoal identification methods for discrete RL tasks into three categories: graph-based, statistics-based, and multi-instance learning (MIL)-based. Our contributions are threefold. First, we introduce a new MIL-based subgoal identification algorithm called EMDD-RL and experimentally compare it with a previous MIL-based method. The previous approach adapts MIL’s Diverse Density (DD) algorithm, whereas our method builds on Expectation-Maximization Diverse Density (EMDD). The advantage of EMDD over DD is that it can yield more accurate results with lower computational demand thanks to the expectation-maximization algorithm. EMDD-RL modifies some of the algorithmic steps of EMDD to identify subgoals in discrete RL problems. Second, we evaluate the methods on several RL tasks with respect to the hyperparameter-tuning overhead they incur. Third, we propose a new RL problem called key-room and compare the subgoal identification performance of the methods on this task. Experimental results show that MIL-based subgoal identification methods can be preferable in practice to the algorithms of the other two categories.
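
The abstract does not detail EMDD-RL’s modifications, so the sketch below only illustrates the underlying EM-DD idea applied to RL trajectories: states visited in successful episodes form positive bags, states from failed episodes form negative bags, and the algorithm alternates between selecting each bag’s most representative state under the current concept (E-step) and moving the concept point to maximize the diverse-density likelihood of those representatives (M-step). The function names (`emdd_subgoal`, `p_inst`), the fixed scale vector, and the gradient-ascent M-step are illustrative assumptions, not the paper’s implementation.

```python
import numpy as np

def p_inst(x, h, s):
    # EM-DD's Gaussian-like instance model: probability that state x
    # matches concept point h, with per-dimension scales s.
    return np.exp(-np.sum((s * (x - h)) ** 2))

def emdd_subgoal(pos_bags, neg_bags, h0, s, iters=30, m_steps=25, lr=0.1):
    """EM-DD-style subgoal search (sketch, not the paper's EMDD-RL).

    pos_bags: list of (n_i, d) arrays -- states from successful episodes
    neg_bags: list of (n_j, d) arrays -- states from failed episodes
    h0:       (d,) initial concept point (e.g., a state from a positive bag)
    s:        (d,) fixed feature scales (full EM-DD also optimizes these)
    Returns the concept point h maximizing the single-representative
    diverse-density likelihood.
    """
    h = h0.astype(float).copy()
    for _ in range(iters):
        # E-step: reduce each bag to its single most likely state under h.
        reps_p = [bag[np.argmax([p_inst(x, h, s) for x in bag])] for bag in pos_bags]
        reps_n = [bag[np.argmax([p_inst(x, h, s) for x in bag])] for bag in neg_bags]
        # M-step: gradient ascent on the log diverse density of the fixed
        # representatives: L(h) = sum_pos log p(x|h) + sum_neg log(1 - p(x|h))
        for _ in range(m_steps):
            g = np.zeros_like(h)
            for x in reps_p:
                g += 2 * s ** 2 * (x - h)  # d log p(x|h) / dh
            for x in reps_n:
                p = p_inst(x, h, s)
                g -= (p / max(1 - p, 1e-12)) * 2 * s ** 2 * (x - h)  # d log(1-p) / dh
            h += lr * g
    return h
```

In a discrete task, the returned concept point would typically be snapped to the nearest state actually visited, and states that score highly across many restarts of the search become subgoal candidates.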
