Abstract

Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to the many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impractical due to a lack of state coverage or to distribution mismatch, i.e., when the learner's goal deviates from the demonstrated behaviors. Moreover, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations to query the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing the sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the Mujoco domain. Experimental results show that our method outperforms prior imitation learning approaches in most of the tasks in terms of exploration efficiency and average scores.
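The abstract gives no pseudocode, so the following is only a minimal sketch of the prioritized goal-sampling idea described above, assuming continuous actions and using L2 distance between expert and policy actions as the disagreement measure; `policy`, `expert`, and all other names are illustrative, not the authors' API.

```python
import numpy as np

def disagreement(policy_action, expert_action):
    # Assumed disagreement measure: L2 distance between continuous actions.
    return np.linalg.norm(np.asarray(policy_action) - np.asarray(expert_action))

def sample_goal(candidates, policy, expert, rng=None):
    """Draw a goal with probability proportional to expert-policy disagreement.

    `candidates` is a list of (state, goal) pairs; `policy` and `expert`
    map (state, goal) -> action. Goals on which the current policy deviates
    most from the demonstrated behavior are sampled more often.
    """
    rng = rng or np.random.default_rng()
    scores = np.array([disagreement(policy(s, g), expert(s, g))
                       for s, g in candidates])
    total = scores.sum()
    probs = scores / total if total > 0 else np.full(len(scores), 1.0 / len(scores))
    idx = rng.choice(len(candidates), p=probs)
    return candidates[idx][1]
```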

Highlights

  • Recent successes in deep reinforcement learning (DRL) have been achieved in domains with a well-specified reward function, such as game-playing [53] or robot control [49]

  • Very small numbers of queries let us outperform prior imitation-based approaches on Fetch and ShadowHand tasks, which involve sparse and delayed rewards. These results suggest that Goal-driven Active Learning (GoAL) can greatly improve exploration efficiency and could help to expand the possible applications of RL

  • In this paper we presented Goal-driven Active Learning (GoAL), a method that introduces interactive goal-driven demonstrations to learn both more effectively and more efficiently


Summary

Introduction

Recent successes in deep reinforcement learning (DRL) have been achieved in domains with a well-specified reward function, such as game-playing [53] or robot control [49]. However, many real-world tasks provide only sparse rewards, which makes exploration difficult. A line of work for overcoming this issue is goal-conditioned learning, a form of self-supervision that constructs a goal-conditioned policy to learn how to reach multiple goals [44, 68]. This idea was extended in Hindsight Experience Replay (HER) [4], which artificially generates new transitions by relabeling goals seen along the state trajectory: by recomputing its reward under different goals, a single transition can be converted into many valid training examples, driving the agent to learn how to achieve multiple goals without additional simulated interactions. Even so, HER may still require a large amount of data to capture complex policies
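For concreteness, here is a minimal sketch of the hindsight relabeling step described above, assuming a sparse goal-reaching reward and the common "future" goal-selection strategy; the tuple layout, `sparse_reward`, and all other names are illustrative rather than the authors' implementation.

```python
import numpy as np

def sparse_reward(achieved, goal, eps=0.05):
    # Typical sparse goal-reaching reward: 0 when close enough, else -1.
    return 0.0 if np.linalg.norm(np.asarray(achieved) - np.asarray(goal)) < eps else -1.0

def relabel_with_hindsight(episode, reward_fn=sparse_reward, k=4, seed=0):
    """Duplicate each transition under goals achieved later in the episode.

    `episode` is a list of (state, action, next_state, goal) tuples. Each
    transition keeps its original goal and additionally gets k substitute
    goals drawn from later achieved states, with rewards recomputed, so one
    transition yields many valid training examples.
    """
    rng = np.random.default_rng(seed)
    out, T = [], len(episode)
    for t, (s, a, s_next, g) in enumerate(episode):
        out.append((s, a, s_next, g, reward_fn(s_next, g)))   # original goal
        for i in rng.integers(t, T, size=k):                  # "future" strategy
            g_new = episode[i][2]   # later achieved state reused as the goal
            out.append((s, a, s_next, g_new, reward_fn(s_next, g_new)))
    return out
```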

