Abstract

In some manipulation robotics environments, hand-programming a robot behavior is often intractable because the dynamics are difficult to model precisely and it is hard to compute features that describe the variety of scene appearances well. Deep reinforcement learning methods partially alleviate this problem: they can dispense with hand-crafted features for the state representation and do not need pre-computed dynamics. However, they often embed prior information in the task definition in the form of shaping rewards, which guide the robot toward goal state areas but require engineering or human supervision and can lead to sub-optimal behavior. In this work we consider a complex robot reaching task with a large range of initial object positions and initial arm positions, and propose a new learning approach with minimal supervision. Inspired by developmental robotics, our method consists of a weakly-supervised stage-wise procedure of three tasks. First, the robot learns to fixate the object with a two-camera system. Second, it learns hand-eye coordination by learning to fixate its end-effector. Third, using the knowledge acquired in the previous steps, it learns to reach the object at different positions and from a large set of initial robot joint angles. Experiments in a simulated environment show that our stage-wise framework achieves reaching performance similar to a supervised setting, without using kinematic models, hand-crafted features, calibration parameters or supervised visual modules.
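
To make the stage-wise procedure concrete, the following is a minimal sketch of the curriculum. All names (train_with_rl, fixate_object, fixate_effector, reach_policy) and the stub action dimensions are hypothetical placeholders, not the authors' implementation; the actual system trains each skill with deep RL on camera images.

```python
"""Minimal sketch of the three-stage, weakly-supervised curriculum described
in the abstract.  Every name below is a hypothetical placeholder; the paper's
actual networks, simulator and RL algorithm are not reproduced here."""

import numpy as np


def train_with_rl(reward_fn, episodes=1000):
    """Stand-in for a deep RL training loop; returns a stub policy that maps
    an observation to an action (e.g. camera or joint commands)."""
    del reward_fn, episodes  # a real implementation would use these
    return lambda observation: np.zeros(2)


# Stage 1: learn to fixate the object with the two-camera system.
fixate_object = train_with_rl(reward_fn=lambda obs: 0.0)

# Stage 2: learn hand-eye coordination by fixating the end-effector.
fixate_effector = train_with_rl(reward_fn=lambda obs: 0.0)


def reaching_reward(obs):
    """Stage-3 shaping reward built from the first two skills: compare the
    gaze that fixates the object with the gaze that fixates the hand, so no
    kinematic model or camera calibration is needed."""
    gaze_object = fixate_object(obs)      # camera angles centred on the object
    gaze_effector = fixate_effector(obs)  # camera angles centred on the hand
    return -float(np.linalg.norm(gaze_object - gaze_effector))


# Stage 3: learn to reach the object from many initial configurations.
reach_policy = train_with_rl(reward_fn=reaching_reward)
```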

Highlights

  • In manipulation robotics, various tasks cannot be programmed by hand because the dynamics are hard to compute and/or hand-crafted features do not describe the variety of scene appearances well enough

  • Our experiments show that learning this task with our weakly-supervised stage-wise framework yields the same reaching performance as with a supervised reward, while learning with a sparse reward is slow

  • Most reinforcement learning (RL) algorithms are based on Markov decision processes < S, A, R, T >, where S is the set of states, A the set of actions, T the transition model (T : S × A → S) and R the reward function (R : S × A → ℝ); a minimal sketch follows below
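
As a concrete illustration of this tuple, here is a minimal, self-contained sketch. The tiny three-state world, the action names and all numbers are illustrative assumptions, not taken from the paper.

```python
"""Minimal sketch of the MDP tuple <S, A, R, T> with a tiny tabular example."""

# States S and actions A as small finite sets.
S = [0, 1, 2]          # e.g. three discrete positions
A = ["left", "right"]

# Deterministic transition model T : S x A -> S.
def T(s, a):
    return max(0, s - 1) if a == "left" else min(len(S) - 1, s + 1)

# Reward function R : S x A -> real numbers (here: reward for reaching state 2).
def R(s, a):
    return 1.0 if T(s, a) == 2 else 0.0

# One step of interaction: the agent in state s takes action a.
s = 0
for a in ["right", "right"]:
    r, s = R(s, a), T(s, a)
    print(f"action={a!r} reward={r} next_state={s}")
```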


Summary

INTRODUCTION

Various manipulation tasks cannot be programmed by hand because the dynamics are hard to compute and/or hand-crafted features do not describe the variety of scene appearances well enough. In Levine et al. (2015, 2016), for tasks such as placing wooden rings or screwing bottle caps onto bottles, informative shaping rewards are computed from a distance measure between the current end-effector or manipulated-object position and its target position; such rewards require knowledge of kinematics or non-trivial visual modules. In our approach, the robot first learns to fixate objects (de La Bourdonnaye et al., 2017) and its own end-effector using a single deep reinforcement learning framework with little prior knowledge in the goal specification. Based on these two skills, an informative shaping reward is built that efficiently guides the robot toward goal state areas.
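
To illustrate the kind of distance-based shaping reward referred to above, here is a hedged sketch. The function name and the example numbers are assumptions for illustration; the exact formula and scaling used by Levine et al. may differ, and the positions are assumed to come from forward kinematics or a calibrated vision module, which is precisely the prior knowledge the present work avoids.

```python
"""Illustrative distance-based shaping reward: negative distance between the
current end-effector (or object) position and its target position."""

import numpy as np


def distance_shaping_reward(effector_pos, target_pos):
    """Dense reward that grows toward zero as the end-effector nears the target."""
    diff = np.asarray(effector_pos) - np.asarray(target_pos)
    return -float(np.linalg.norm(diff))


# Example: the reward increases (toward 0) as the hand gets closer.
print(distance_shaping_reward([0.4, 0.1, 0.3], [0.5, 0.0, 0.2]))  # about -0.17
print(distance_shaping_reward([0.5, 0.0, 0.2], [0.5, 0.0, 0.2]))  # 0.0
```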

The remaining sections of the summary (Reinforcement Learning, Overview, Task Overview, Object Detection, Learning to Touch, Reward Computation, Experiments, Results, Contributions, Related Work, Limitations, Further Research) are not reproduced here.
