Abstract
Maximum entropy inverse reinforcement learning algorithms have been extensively studied for learning rewards and optimizing policies from expert demonstrations. However, high-dimensional features and limited or non-optimal expert demonstrations can easily lead to overfitting, vanishing gradients, exploding gradients, and slow convergence. To address these challenges, an adaptive generative adversarial maximum entropy inverse reinforcement learning algorithm is proposed, termed AGA-MEIRL. This algorithm can learn rewards and optimize policies with updated mixed expert demonstrations. The primary contribution is that the adaptive generative adversarial network (AdaGAN) aggregates potentially weak individual predictors into a strong composite predictor, thereby mitigating the mode collapse and overfitting problems of the discriminator when learning rewards. To address the vanishing gradient problem, the SELU activation function is adopted in AGA-MEIRL. Additionally, gradient clipping is introduced into AGA-MEIRL to tackle the exploding gradient problem, enhancing the algorithm's stability and preventing data overflow. The convergence analysis of AGA-MEIRL is established based on the upper bound of the AdaGAN. Experimental results on benchmark tasks and a rolling bearing fault diagnosis experiment demonstrate that AGA-MEIRL achieves superior rewards and success rates, effectively addressing the problems above and outperforming current MEIRL approaches in learning rewards and policies from mixed expert demonstrations.
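To make the two stabilization choices named in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of a discriminator update that uses SELU activations against vanishing gradients and gradient-norm clipping against exploding gradients. The network sizes, the optimizer choice, and the clipping threshold `max_grad_norm` are illustrative assumptions.

```python
# Hypothetical sketch of a GAN-style discriminator step with the two
# stabilization techniques mentioned in the abstract: SELU activations
# and gradient clipping. All hyperparameters are assumptions.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, state_dim: int, hidden: int = 64):
        super().__init__()
        # SELU is self-normalizing, keeping activations in a well-scaled
        # range and helping to mitigate vanishing gradients.
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.SELU(),
            nn.Linear(hidden, hidden), nn.SELU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def discriminator_step(disc, optimizer, expert_batch, policy_batch,
                       max_grad_norm: float = 1.0):
    """One adversarial update: expert samples labeled 1, policy samples 0."""
    bce = nn.BCEWithLogitsLoss()
    loss = (bce(disc(expert_batch), torch.ones(expert_batch.size(0), 1))
            + bce(disc(policy_batch), torch.zeros(policy_batch.size(0), 1)))
    optimizer.zero_grad()
    loss.backward()
    # Gradient clipping bounds the update magnitude, addressing the
    # exploding-gradient / data-overflow issue noted in the abstract.
    torch.nn.utils.clip_grad_norm_(disc.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item()
```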