Proximal Policy Optimization With Time-Varying Muscle Synergy for the Control of an Upper Limb Musculoskeletal System

Rong Liu,Jason Gu,Yongxuan Wang,Jiaxing Wang,Yin Liu,Yaru Chen

doi:10.1109/tase.2023.3254583

Abstract

Because of their unique adaptability, flexibility, and robustness, musculoskeletal robotic systems are regarded as potential next-generation robots. However, motion learning and generation for such systems remain challenging. This paper presents a neuromuscular control method, TMS-PPO, based on time-varying muscle synergy (TMS) and proximal policy optimization (PPO). The electromyogram (EMG) activation signals of actual human motions are decomposed into TMSs according to the temporal properties of the TMS. Network weights are trained through PPO to generate scale and phase coefficients. These coefficients modulate the TMSs to produce appropriate muscle activation patterns, which optimizes motion learning of the musculoskeletal system. To verify the effectiveness of the proposed method, TMSs are extracted from human upper limb muscle activation signals, and TMS-PPO is compared with PPO in the motion learning and generation process of an upper limb musculoskeletal system. The results show that TMS-PPO can complete the control tasks, with average joint errors below 0.05 rad. In addition, the TMSs serve as motion primitives of the musculoskeletal system, simulating how the human central nervous system (CNS) controls muscles. Compared with PPO, TMS-PPO significantly reduces energy consumption and speeds up learning: the number of learning episodes drops from \(10^4\) to \(10^3\), which indicates that TMS-PPO has a stronger learning ability and a better physiological interpretation.
Note to Practitioners: Owing to the advantages of the musculoskeletal system, humanoid robots that imitate human actuation mechanisms are being actively developed worldwide. By taking advantage of human-like characteristics, the musculoskeletal robot provides new opportunities to understand and validate the human mechanisms of muscle control and motion learning, to compare the performance of the robot with that of humans, and to work in the real world in the future, e.g., as human-interactive robots, amusement robots, and medical training robots. However, the strong redundancy, coupling, and nonlinearity of the system also raise many challenges for the control problem. Inspired by how the human CNS controls a musculoskeletal system and realizes motion generalization, this paper proposes a novel muscle-synergy-based neuromuscular control method, TMS-PPO, which combines time-varying muscle synergy (TMS) and proximal policy optimization (PPO). The method improves the learning efficiency of PPO and the physiological interpretation of the control process during motion learning and generation of the musculoskeletal system. Preliminary simulation experiments suggest that this method is feasible in terms of control accuracy and efficiency. However, the performance of TMS-PPO is only comparable to that of PPO, without significant improvement. To address this, in future work we will introduce a cerebellar model into the control method. In humans, the cerebellum adjusts and corrects limb motions to achieve accurate and stable control during actions.
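The abstract states that scale and phase coefficients modulate the TMSs to produce muscle activation patterns, but it does not give the formula. As a rough illustration only, the sketch below uses the time-varying synergy model commonly found in the literature, m(t) = Σ_i c_i · w_i(t − t_i), where w_i is the i-th synergy (one activation profile per muscle), c_i is its scale coefficient, and t_i is its time shift (phase). The function name `combine_synergies`, the integer time shifts, and the clamping to [0, 1] are assumptions made for this example, not details taken from the paper. In TMS-PPO the coefficients would come from the PPO-trained networks; here they are plain inputs.

```python
def combine_synergies(synergies, scales, shifts, n_steps):
    """Sum scaled, time-shifted synergies into muscle activations.

    synergies: list of K synergies, each a list of M muscle profiles,
               each profile a list of T activation samples.
    scales:    K scale coefficients (assumed non-negative).
    shifts:    K integer time shifts, in samples.
    n_steps:   length of the output activation sequence.
    Returns a list of M rows, each with n_steps activations in [0, 1].
    """
    n_muscles = len(synergies[0])
    activation = [[0.0] * n_steps for _ in range(n_muscles)]
    for synergy, scale, shift in zip(synergies, scales, shifts):
        for m, profile in enumerate(synergy):
            for k, value in enumerate(profile):
                t = shift + k
                # Samples shifted outside the movement window are dropped.
                if 0 <= t < n_steps:
                    activation[m][t] += scale * value
    # Assumption: activations are clamped to [0, 1], the usual range
    # for muscle excitation in musculoskeletal simulators.
    return [[min(max(a, 0.0), 1.0) for a in row] for row in activation]


# Hypothetical toy data: two synergies, two muscles, four samples each.
flat = [1.0, 1.0, 1.0, 1.0]
toy_synergies = [[flat, flat], [flat, flat]]
toy_activation = combine_synergies(toy_synergies, [0.5, 0.25], [0, 2], 5)
```

Under this model the policy only has to output one scale and one shift per synergy, rather than one activation per muscle per time step. That smaller action space is one plausible reading of why the abstract reports fewer learning episodes for TMS-PPO than for PPO.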

Journal: IEEE Transactions on Automation Science and Engineering	Publication Date: Apr 1, 2024
Citations: 4

Similar Papers

Two-stage fuzzy object grasping controller for a humanoid robot with proximal policy optimization
Ping-Huan Kuo ... Kuan-Lin Chen
Engineering Applications of Artificial Intelligence | VOL. 125
03 Jul 2023

Muscle-Synergies-Based Neuromuscular Control for Motion Learning and Generalization of a Musculoskeletal System
Jiahao Chen ... Hong Qiao
IEEE Transactions on Systems, Man, and Cybernetics: Systems | VOL. 51
07 Feb 2020

Deep-reinforcement-learning-based gait pattern controller on an uneven terrain for humanoid robots
Ping-Huan Kuo ... Her-Terng Yau
International Journal of Optomechatronics | VOL. 17
15 Jun 2023

Deep Reinforcement Learning for Humanoid Robot Dribbling
Alexandre F V Muzio ... Takashi Yoneyama
09 Nov 2020
