Abstract

This paper presents a novel reinforcement learning (RL) framework to design cascade feedback control policies for 3D bipedal locomotion. Existing RL algorithms are often trained in an end-to-end manner or rely on prior knowledge of reference joint or task-space trajectories. Unlike these approaches, we propose a policy structure that decouples the bipedal locomotion problem into two modules, incorporating physical insights from the nature of the walking dynamics and the well-established Hybrid Zero Dynamics approach for 3D bipedal walking. As a result, the overall RL framework offers several key advantages, including a lightweight network structure, sample efficiency, and reduced dependence on prior knowledge. The proposed solution learns stable and robust walking gaits from scratch and allows the controller to realize omnidirectional walking with accurate tracking of the desired velocity and heading angle. The learned policies also perform robustly against adversarial forces applied to the torso and when walking blindly over a series of challenging and unstructured terrains. These results demonstrate that the proposed cascade feedback control policy is suitable for the navigation of 3D bipedal robots in indoor and outdoor environments.

Highlights

  • While humans and biological bipeds can naturally learn complex motion planning, it remains a challenging task for bipedal robots due to their highly unstable dynamics

  • We extend our preliminary results to further increase the efficiency of the learning method, consider an additional degree of freedom to include constrained arm motion in the walking gait, include additional regulations to improve the performance of the controller, and perform an extensive series of indoor and outdoor experiments to demonstrate the good performance of the learned policy on hardware

  • While there are various ways to design motion policies for bipedal locomotion in the literature, our work focuses on reinforcement learning (RL) design approaches [6], [10], [11]


Summary

INTRODUCTION

While humans and biological bipeds can naturally learn complex motion planning, it remains a challenging task for bipedal robots due to their highly unstable dynamics. We address this problem using a cascaded structure that combines reinforcement-learning-based motion planning with model-based feedback control design. We extend our preliminary results to further increase the efficiency of the learning method, consider an additional degree of freedom to include constrained arm motion in the walking gait, include additional regulations to improve the performance of the controller, and conduct an extensive series of indoor and outdoor experiments to test the policy on real hardware. These experiments demonstrate that the learned policy accurately tracks the desired walking velocity and the desired torso orientation, enabling the application of the proposed RL framework with confidence for terrain navigation in indoor and outdoor environments. A minimal sketch of the cascaded structure follows.
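To make the cascaded structure concrete, the following minimal Python sketch illustrates the two modules: a high-level planner that, in the actual framework, would be the trained RL policy emitting reference trajectories, and a low-level model-based regulator that tracks them. All names (CascadePolicy, plan, regulate), gains, dimensions, and trajectory shapes are hypothetical placeholders for illustration, not the authors' implementation.

    import numpy as np

    class CascadePolicy:
        # Two-level cascade: a high-level planner (stand-in for the trained
        # RL network) maps task commands and gait phase to reference joint
        # trajectories; a low-level regulator tracks them with a joint-space
        # PD law. Gains and dimensions are illustrative placeholders.

        def __init__(self, n_joints=10, kp=300.0, kd=10.0):
            self.n_joints = n_joints
            self.kp, self.kd = kp, kd

        def plan(self, velocity_cmd, heading_cmd, phase):
            # Placeholder for the learned planner: here, a simple
            # phase-dependent leg swing scaled by the commanded velocity
            # and heading, purely for illustration.
            q_des = np.zeros(self.n_joints)
            q_des[0] = 0.3 * velocity_cmd * np.sin(2.0 * np.pi * phase)  # hip pitch
            q_des[1] = 0.2 * heading_cmd                                 # hip yaw
            dq_des = np.zeros(self.n_joints)
            return q_des, dq_des

        def regulate(self, q, dq, q_des, dq_des):
            # Low-level feedback regulator: PD torque law tracking the plan.
            return self.kp * (q_des - q) + self.kd * (dq_des - dq)

    # Example control step at one instant of the gait cycle.
    policy = CascadePolicy()
    q, dq = np.zeros(10), np.zeros(10)            # measured joint state
    q_des, dq_des = policy.plan(velocity_cmd=0.5, heading_cmd=0.1, phase=0.25)
    tau = policy.regulate(q, dq, q_des, dq_des)   # torques sent to the motors

The key design choice the sketch captures is the separation of time scales: the planner can run at a low rate (and is the only learned component), while the feedback regulator runs at the control rate and keeps the network structure lightweight.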

CASCADED MOTION CONTROL FRAMEWORK
ACTION SPACE
FEEDBACK REGULATOR POLICY DESIGN
TORQUE REGULATIONS
ILLUSTRATION EXAMPLES
FEEDBACK REGULATIONS
SIMULATION AND EXPERIMENTAL RESULTS
CONCLUSION
