Abstract

Balancing and push recovery are essential capabilities for humanoid robots to solve complex locomotion tasks. In this context, classical control systems tend to be based on simplified physical models and hard-coded strategies. Although successful in specific scenarios, this approach requires demanding parameter tuning and switching logic between specifically designed controllers to handle more general perturbations. We apply model-free Deep Reinforcement Learning to train a general and robust humanoid push-recovery policy in a simulation environment. Our method targets high-dimensional whole-body humanoid control and is validated on the iCub humanoid. Reward components incorporating expert knowledge of humanoid control enable the same policy to quickly learn several robust behaviors spanning the entire body. We validate our method with extensive quantitative analyses in simulation, including out-of-sample tasks that demonstrate policy robustness and generalization, both key requirements for real-world robot deployment.
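The reward shaping mentioned in the abstract can be pictured as a weighted combination of terms, each rewarding one aspect of balanced whole-body behavior. The sketch below is illustrative only, not the paper's exact formulation: the component names, kernel cutoffs, and weights are assumptions chosen for exposition.

```python
import numpy as np

def rbf(error, cutoff):
    """Squared-exponential kernel: 1 at zero error, decaying towards 0 beyond the cutoff."""
    return np.exp(-(np.linalg.norm(error) / cutoff) ** 2)

def composite_reward(obs, weights=None):
    """Weighted sum of bounded reward components (hypothetical keys and weights)."""
    weights = weights or {"posture": 0.3, "com_height": 0.3, "base_velocity": 0.4}
    components = {
        # Stay close to a nominal whole-body joint configuration.
        "posture": rbf(obs["joint_positions"] - obs["nominal_joint_positions"], cutoff=0.5),
        # Keep the centre of mass near its reference height.
        "com_height": rbf(obs["com_height"] - obs["com_height_ref"], cutoff=0.1),
        # Damp residual horizontal motion of the floating base after a push.
        "base_velocity": rbf(obs["base_linear_velocity"][:2], cutoff=0.5),
    }
    return sum(weights[name] * value for name, value in components.items())
```

Keeping every component in (0, 1] before weighting prevents any single objective from dominating the learning signal, which is one common way expert knowledge is folded into the reward.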

Highlights

  • Bipeds are creatures that use two legs to move while maintaining static or dynamic equilibrium

  • These methods target benchmarking model-free Deep Reinforcement Learning (DRL) for continuous control and realistic animation of simplified characters, rather than applicability to real humanoid robots

  • Whole-body humanoid control with DRL remains an open problem; such methods have the potential to learn high-dimensional locomotion policies, further improving humanoid capabilities to recover from external perturbations


Summary

INTRODUCTION

Bipeds are creatures that use two legs to move while maintaining static or dynamic equilibrium. Learned behaviors often display unnatural characteristics, such as asymmetric gaits, abrupt motions of the body and limbs, or even unrealistic motions that exploit imperfections and glitches in the physics simulator of choice. These issues significantly limit generalization and transferability to real-world robots. Control architectures are often organized as hierarchies composed of trajectory optimization [8], simplified-model control, and whole-body quadratic programming [9], [10]. While such approaches have achieved considerable results on both simulated and real humanoid robots, they: 1) rely on an accurate description of the robot dynamics; 2) require hand-crafted features for online execution [11]; 3) present challenges when simultaneously facing different tasks. Our approach is instead inspired by floating-base dynamics: the policy observations encode sufficient information for solving the task with no prior knowledge of the desired trajectories, as sketched below.
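As a concrete illustration of such floating-base observations, the sketch below assembles base and joint quantities into a single vector. The field names, the exact set of signals, and the 23-joint configuration are assumptions for illustration, not the paper's exact specification.

```python
import numpy as np

def build_observation(robot_state: dict) -> np.ndarray:
    """Concatenate floating-base and joint-space signals into one observation vector."""
    return np.concatenate([
        robot_state["base_orientation"],       # base orientation, e.g. a quaternion (4)
        robot_state["base_linear_velocity"],   # linear velocity of the floating base (3)
        robot_state["base_angular_velocity"],  # angular velocity of the floating base (3)
        robot_state["joint_positions"],        # whole-body joint angles (n_joints)
        robot_state["joint_velocities"],       # whole-body joint velocities (n_joints)
    ])

# Example with placeholder values for a hypothetical 23-joint humanoid.
n_joints = 23
state = {
    "base_orientation": np.array([1.0, 0.0, 0.0, 0.0]),
    "base_linear_velocity": np.zeros(3),
    "base_angular_velocity": np.zeros(3),
    "joint_positions": np.zeros(n_joints),
    "joint_velocities": np.zeros(n_joints),
}
print(build_observation(state).shape)  # (56,)
```

No reference trajectory appears in the observation: the policy must infer a recovery strategy purely from the current state of the base and the joints.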

The full article is organized into the following sections:

  • Control-Theoretic Approaches
  • Deep Reinforcement Learning Approaches
  • BACKGROUND
  • ENVIRONMENT
  • Action
  • Reward
  • Other Specifications
  • Deterministic Planar Forces
  • Random Spherical Forces on the Base Links
  • Training Performance
  • Random Spherical Forces on the Chest and Elbow Links
  • DISCUSSION
  • CONCLUSION