Abstract

Locomotion control has long been vital to legged robots. Agile locomotion can be implemented through either model-based controllers or reinforcement learning. Model-based methods have been shown to yield robust controllers, while learning-based policies have advantages in generalization. This paper proposes a hybrid locomotion control framework that combines deep reinforcement learning with a simple heuristic policy, assigning them to different activation phases; this provides guidance for adaptive training without producing conflicts between heuristic knowledge and the learned policy. Training in simulation follows a step-by-step stochastic curriculum to guarantee success. Domain randomization during training and assistive extra feedback loops on the real robot are also adopted to smooth the transition to the real world. Comparison experiments are carried out on both simulated and real Wukong-IV humanoid robots, and the proposed hybrid approach outperforms canonical end-to-end approaches, achieving a higher success rate, faster convergence, and 60% less tracking error in velocity tracking tasks.
