Abstract

Humanoid robots, which remain a major challenge in robotics, are equipped with human-like arms to make them more acceptable to the general public. Digital twin technology aligns with the guiding principles of both Industry 4.0 and Made in China 2025. This paper proposes a scheme that combines deep reinforcement learning (DRL) with digital twin technology for controlling humanoid robot arms. To achieve rapid and stable motion planning, we propose multitask-oriented training using a twin synchro-control (TSC) scheme with DRL; because the robot must switch between tasks, arm training has to be both fast and diverse. We develop an approach for obtaining a priori knowledge as input to DRL and verify it in simulation on two simple example tasks. A data acquisition system was developed to generate human joint angle data efficiently and automatically. These data are used to improve the reward function of the deep deterministic policy gradient (DDPG) algorithm and to train the robot quickly for a task. The approach is applied to a model of BHR-6, a humanoid robot with multiple motion modes and a sophisticated mechanical structure. Using the policies trained in simulation, the robot can perform tasks that cannot be trained with existing methods, and training is fast enough to support multiple tasks. Our approach uses the human joint angle data collected by the data acquisition system to address the sparse-reward problem in DRL for the two example tasks. A comparison with simulation results for controllers trained using the vanilla DDPG shows that the controller trained with the DDPG under the TSC scheme has clear advantages in learning stability and convergence speed.
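To make the reward-shaping idea above more concrete, the following minimal sketch shows one way the collected human joint angle data could be folded into the DDPG reward as a dense imitation term that supplements a sparse task reward. The function name, weights, and data layout are illustrative assumptions for this summary, not the paper's exact formulation.

    import numpy as np

    def shaped_reward(robot_angles, human_ref_angles, task_reward,
                      w_task=1.0, w_imitation=0.5):
        """Dense reward: sparse task reward plus an imitation term.

        robot_angles and human_ref_angles are arrays of joint angles (rad)
        at the current time step; the reference comes from the human data
        acquisition system. The weights are illustrative assumptions.
        """
        tracking_error = np.linalg.norm(robot_angles - human_ref_angles)
        imitation = np.exp(-tracking_error ** 2)   # in (0, 1], largest when matched
        return w_task * task_reward + w_imitation * imitation

Shaping the reward in this way gives the learner a gradient toward human-like joint configurations even before the sparse task reward is triggered, which is the intuition behind the reported gains in learning stability and convergence speed.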

Highlights

  • Humanoid robots have recently become a focus in academic research

  • We propose a scheme for rapid and accurate motion planning based on deep reinforcement learning (DRL) with twin synchro-control (TSC)

  • We combined the deep deterministic policy gradient (DDPG) algorithm with a sensor-based hardware system for efficient human joint angle data acquisition (a DDPG update sketch follows this list)
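For readers unfamiliar with the learning component, the sketch below outlines a standard DDPG update step (actor, critic, target networks, and Polyak averaging). The network sizes, state/action dimensions, and hyper-parameters are assumptions for illustration, not the settings used in the paper.

    # Minimal DDPG update step (illustrative sketch).
    import torch
    import torch.nn as nn

    OBS_DIM, ACT_DIM = 14, 7   # hypothetical arm state / joint dimensions
    GAMMA, TAU = 0.99, 0.005   # assumed discount factor and soft-update rate

    def mlp(in_dim, out_dim, out_act=None):
        layers = [nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, out_dim)]
        if out_act is not None:
            layers.append(out_act)
        return nn.Sequential(*layers)

    actor = mlp(OBS_DIM, ACT_DIM, nn.Tanh())        # deterministic policy mu(s)
    critic = mlp(OBS_DIM + ACT_DIM, 1)              # action-value Q(s, a)
    actor_t = mlp(OBS_DIM, ACT_DIM, nn.Tanh())      # target networks
    critic_t = mlp(OBS_DIM + ACT_DIM, 1)
    actor_t.load_state_dict(actor.state_dict())
    critic_t.load_state_dict(critic.state_dict())
    actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
    critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

    def ddpg_update(batch):
        # batch: (s, a, r, s2, done) float tensors from a replay buffer,
        # with r and done shaped (batch_size, 1).
        s, a, r, s2, done = batch
        with torch.no_grad():                        # TD target uses target nets
            q_next = critic_t(torch.cat([s2, actor_t(s2)], dim=-1))
            y = r + GAMMA * (1.0 - done) * q_next
        critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=-1)), y)
        critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

        actor_loss = -critic(torch.cat([s, actor(s)], dim=-1)).mean()
        actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

        for net, tgt in ((actor, actor_t), (critic, critic_t)):  # Polyak averaging
            for p, p_t in zip(net.parameters(), tgt.parameters()):
                p_t.data.mul_(1 - TAU).add_(TAU * p.data)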

Introduction

State-of-the-art humanoid robots are capable of working alongside humans. They can stabilize themselves in a practical work environment and use their arms for simple tasks, such as lifting a box, using power tools, and maintaining their balance. These tasks usually require a researcher or engineer to carefully design the robot's trajectory. Traditional robotic arm motion planning schemes are time-consuming and require expertise in mathematics, kinematics, inverse kinematics, and other areas, and universal dual-arm robots usually take a long time to learn a specific action. A task-oriented arm ...
