Abstract

In this article, we propose a novel variant of path integral policy improvement with covariance matrix adaptation ($\mathrm{PI}^2$-CMA), a reinforcement learning (RL) algorithm that optimizes a parameterized policy for the continuous behavior of robots. $\mathrm{PI}^2$-CMA has a hyperparameter called the temperature parameter, whose value is critical for performance; however, it has received little attention, and the existing adjustment method still contains a tunable parameter that can itself be critical to performance, so tuning by trial and error remains necessary. Moreover, we show that there is a problem setting that the existing method cannot learn. The proposed method solves both problems by automatically adjusting the temperature parameter at each update. We confirm the effectiveness of the proposed method through numerical experiments.
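
To make the role of the temperature parameter concrete, the following is a minimal sketch (not the authors' method) of a PI^2-style update, assuming the common formulation in which rollout costs are mapped to softmax weights and the weighted samples drive the mean and covariance update; the names `pi2_weights`, `update_mean_and_cov`, and `temperature` are illustrative assumptions.

```python
import numpy as np

def pi2_weights(costs, temperature):
    """Map rollout costs S_k to weights w_k = exp(-S_k / temperature) / sum_j exp(-S_j / temperature).
    Smaller temperature concentrates weight on the lowest-cost rollouts."""
    costs = np.asarray(costs, dtype=float)
    shifted = costs - costs.min()            # shift for numerical stability
    logits = -shifted / temperature
    w = np.exp(logits - logits.max())
    return w / w.sum()

def update_mean_and_cov(theta_samples, costs, temperature):
    """Reward-weighted averaging of sampled policy parameters, with a
    weighted covariance update in the spirit of PI^2-CMA (illustrative only)."""
    w = pi2_weights(costs, temperature)      # shape (K,)
    mean = w @ theta_samples                 # weighted mean of parameter samples
    diff = theta_samples - mean
    cov = (w[:, None] * diff).T @ diff       # weighted sample covariance
    return mean, cov

# Example: 5 rollouts of a 3-dimensional policy parameter vector.
rng = np.random.default_rng(0)
theta_samples = rng.normal(size=(5, 3))
costs = rng.uniform(size=5)
mean, cov = update_mean_and_cov(theta_samples, costs, temperature=0.1)
print(mean)
print(cov)
```

In this sketch a fixed `temperature` must be chosen by hand; the article's proposal is to adjust this value automatically at each update rather than fixing it in advance.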
