Abstract
In Deep Reinforcement Learning (DRL) for robotics applications, it is important to find energy-efficient motions. For this purpose, a standard method is to add an action penalty to the reward so that the optimal motion accounts for energy expenditure. This method is widely used because it is simple to implement. However, since the reward is a linear sum, if the penalty is too large the system falls into a local minimum and no moving solution is obtained, whereas if the penalty is too small the energy-saving effect may be insufficient. The penalty weight must therefore be tuned so that the agent still moves dynamically while the energy-saving effect remains sufficient. Because such hyperparameter tuning is computationally expensive, a learning method that is robust to the penalty setting is needed. We investigated the Spiking Neural Network (SNN), which has been attracting attention for its computational efficiency and neuromorphic architecture. We conducted gait experiments with a hexapod agent in a simulation environment while varying the energy penalty setting. By applying SNNs to conventional state-of-the-art DRL algorithms, we examined whether the agent could find an optimal gait under a wider range of penalty weights and obtain an energy-efficient gait, evaluated with the Cost of Transport (CoT), a metric of energy efficiency for gait. Soft Actor-Critic (SAC)+SNN achieved a CoT of 1.64, Twin Delayed Deep Deterministic policy gradient (TD3)+SNN a CoT of 2.21, and Deep Deterministic Policy Gradient (DDPG)+SNN a CoT of 2.08 (versus 1.91 for standard SAC, 2.38 for TD3, and 2.40 for DDPG). DRL combined with SNN thus succeeded in learning more energy-efficient gaits with lower CoT.
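As a concrete illustration of the linear-sum penalty described above, the following minimal Python sketch shows a reward of this form. The term names (forward_velocity, w_penalty) and the quadratic action cost are illustrative assumptions, not the paper's exact reward.

import numpy as np

def shaped_reward(forward_velocity, action, w_penalty=0.005):
    # Linear-sum reward with an action (energy) penalty.
    # Hypothetical sketch: the exact reward terms and the value
    # of w_penalty are assumptions for illustration.
    task_term = forward_velocity                           # rewards locomotion speed
    penalty_term = w_penalty * np.sum(np.square(action))   # penalizes large actions
    # If w_penalty is too large, standing still maximizes the reward
    # (a local minimum); if too small, the energy-saving effect vanishes.
    return task_term - penalty_term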
Highlights
Energy-efficient control is an important aspect of robotics because the energy resources of autonomous mobile robots are limited.
One standard way is to add an action penalty term to the reward function, formed by multiplying the agent's action by a weight coefficient, to account for energy expenditure. This method can be practically applied to any Deep Reinforcement Learning (DRL) algorithm because it only adds a term to the reward function, and it is reported to be effective in preventing overfitting [4].
The effect of Spiking Neural Network (SNN)-driven DRL was investigated across different DRL algorithms and evaluated for the energy efficiency of the hexapod gait using the Cost of Transport (CoT); a sketch of the CoT computation follows these highlights.
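For reference, CoT is conventionally defined as the energy consumed per unit weight per unit distance traveled, CoT = E / (m g d); a lower value means a more efficient gait. A minimal sketch follows (variable names are illustrative):

def cost_of_transport(energy_j, mass_kg, distance_m, g=9.81):
    # CoT = E / (m * g * d): dimensionless cost of locomotion.
    return energy_j / (mass_kg * g * distance_m)

# Example: a 2 kg hexapod that consumes 100 J while traveling 3 m
# has CoT = 100 / (2 * 9.81 * 3) ≈ 1.70.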
Summary
Energy-efficient control is an important aspect of robotics because the energy resources of autonomous mobile robots are limited. One standard way is to add an action penalty term to the reward function, formed by multiplying the agent's action by a weight coefficient, to account for energy expenditure. This method can be practically applied to any DRL algorithm because it only adds a term to the reward function, and it is reported to be effective in preventing overfitting [4]. SAC and TD3 are known as state-of-the-art DRL algorithms.
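Since the study combines these algorithms with spiking networks, the sketch below shows one common SNN building block, a leaky integrate-and-fire (LIF) layer, of the kind that could replace a dense activation inside an actor or critic network. This is a generic illustration; the paper's actual neuron model, decay constant, and threshold are not specified here.

import numpy as np

def lif_step(v, x, decay=0.9, v_th=1.0):
    # One time step of a leaky integrate-and-fire layer.
    # v: membrane potentials; x: weighted input current.
    v = decay * v + x                         # leaky integration of the input
    spikes = (v >= v_th).astype(np.float32)   # emit binary spikes at threshold
    v = v * (1.0 - spikes)                    # reset neurons that fired
    return v, spikes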