Control and Simulation of a 6-DOF Biped Robot based on Twin Delayed Deep Deterministic Policy Gradient Algorithm

Phan Bui Khoi, Nguyen Truong Giang, Hoang Van Tan

doi:10.17485/ijst/v14i30.1030

Open Access

Journal: Indian Journal of Science and Technology
Publication Date: Jul 14, 2021
Reference: Khoi et al., Indian Journal of Science and Technology 2021;14(30):2460–2471
Citations: 4
License: CC BY

Abstract

Objectives: To study an algorithm that controls a bipedal robot so that it walks with a gait close to that of a human. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm is known to be highly efficient; it makes a few changes to the commonly used Deep Deterministic Policy Gradient (DDPG) algorithm for continuous-action-space problems in reinforcement learning.

Methods: Unlike the commonly used sparse reward function model, this study proposes a reward model that combines a sparse reward function with a dense reward function. The TD3 algorithm is applied together with the proposed reward model to control a bipedal robot model with 6 degrees of freedom. The training process is simulated in the Gazebo/Robot Operating System (ROS) environment.

Findings: The results show that choosing a reward model that combines a sparse reward function and a dense reward function suited to the robot model helps the robot learn faster and achieve better results. The biped robot can walk straight with an almost human-like gait. The results from the TD3 algorithm combined with the proposed reward model are also compared with the results from other algorithms.

Novelty: The TD3 algorithm combined with the proposed reward model is applied to a 6-DOF biped robot, and the robot's gait is simulated in the Gazebo/ROS environment. ROS is a middleware that can be used to control a robot in a real environment in the future.

Keywords: TD3; biped robot; reinforcement learning; ROS; Gazebo
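The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the reward terms, weights, and every function and parameter name below (`combined_reward`, `td3_target`, `w_vel`, `fall_penalty`, and so on) are hypothetical, chosen only to show how a dense per-step signal can be added to a sparse event-based signal, and how TD3 forms its critic target differently from DDPG (noise-smoothed target action and the minimum of two target critics). The default hyperparameter values are assumptions as well.

```python
import random


def combined_reward(forward_velocity, torso_tilt, fell, reached_goal,
                    w_vel=1.0, w_tilt=0.5,
                    fall_penalty=-100.0, goal_bonus=100.0):
    """Hypothetical sparse + dense reward; terms and weights are illustrative."""
    # Dense part: gives feedback at every simulation step.
    dense = w_vel * forward_velocity - w_tilt * abs(torso_tilt)
    # Sparse part: nonzero only when a rare event occurs.
    sparse = 0.0
    if fell:
        sparse += fall_penalty
    if reached_goal:
        sparse += goal_bonus
    return dense + sparse


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    """Critic target for one transition, following the TD3 update rule."""
    # Target policy smoothing: perturb the target action with clipped noise,
    # then clip the result to the valid action range.
    next_action = []
    for a in target_actor(next_state):
        noise = random.gauss(0.0, noise_std)
        noise = max(-noise_clip, min(noise_clip, noise))
        next_action.append(max(-max_action, min(max_action, a + noise)))
    # Clipped double-Q learning: use the smaller of the two target critics.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))
    # No bootstrapping from terminal states (for example, after a fall).
    return reward + (0.0 if done else gamma * q_next)

# The third TD3 change, delayed policy updates (updating the actor less often
# than the critics), belongs to the training loop and is not shown here.
```

In a training loop, the value returned by `combined_reward` for a step would be passed as `reward` to `td3_target`, and both critics would be regressed toward the resulting target.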

Highlights

With the development of digital technology and the rapid progress of artificial intelligence, robots are gradually being used in production and in everyday life, and they can replace humans in difficult and dangerous jobs.
Robotics research is attracting the interest of many researchers. Many robot models are designed to resemble the human shape so that, based on the human gait, the robot can be controlled in the most convenient way.
A biped robot model shaped like human legs is built to evaluate the algorithm; each leg has 3 degrees of freedom, and each joint can rotate within its joint limits.

Summary

Introduction

With the development of digital technology and the rapid progress of artificial intelligence, robots are gradually being used in production and in everyday life, and they can replace humans in difficult and dangerous jobs. The work efficiency of robots keeps improving, gradually relieving people of work that is hazardous to their health. Robots can now also be sociable and interact directly with people. With these benefits, robotics research is attracting the interest of many researchers. Many robot models are designed to resemble the human shape so that, based on the human gait, the robot can be controlled in the most convenient way. Studying, designing, and controlling a human-shaped robot capable of human-like movement will help the robot move more flexibly on more complex terrain. There are many methods for controlling biped robots. For traditional solution methods [5,6], we need to set up the Denavit-
