Model-Free Deep Inverse Reinforcement Learning by Logistic Regression

Eiji Uchibe

doi:10.1007/s11063-017-9702-7

Abstract

This paper proposes model-free deep inverse reinforcement learning to find nonlinear reward function structures. We formulate inverse reinforcement learning as a density ratio estimation problem and show that, under the framework of linearly solvable Markov decision processes, the log of the ratio between an optimal state transition and a baseline one is given by a part of the reward and the difference of the value functions. The logarithm of the density ratio is efficiently calculated by binomial logistic regression, whose classifier is constructed from the reward and the state value function. The classifier tries to discriminate between samples drawn from the optimal state transition probability and those drawn from the baseline one. The estimated state value function is then used to initialize part of the deep neural networks for forward reinforcement learning. The proposed deep forward and inverse reinforcement learning is applied to two benchmark games: Atari 2600 and Reversi. Simulation results show that our method reaches the best performance substantially faster than the standard combination of forward and inverse reinforcement learning as well as behavior cloning.
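
The density-ratio idea in the abstract can be illustrated with a small tabular sketch. This is not the paper's deep-network implementation. It is a toy example under several assumptions that are not taken from the source: a 5-state chain, a uniform-neighbor baseline transition, an inverse temperature of 1, a discount factor of 0.9, identical state distributions in both datasets, and equal sample sizes (so the classifier logit equals the log density ratio). All function and variable names are hypothetical. The classifier logit is parameterized as q(x) + gamma * V(y) - V(x), that is, a reward part plus a difference of value functions, and is fitted by plain gradient descent on the logistic loss.

```python
import numpy as np

GAMMA = 0.9    # discount factor (assumed value for this toy example)
N_STATES = 5   # toy chain with states 0..4 (assumption, not from the paper)

def baseline_matrix(n):
    """Baseline b(y|x): uniform over {x-1, x, x+1}, clipped to the chain."""
    b = np.zeros((n, n))
    for x in range(n):
        for y in (x - 1, x, x + 1):
            if 0 <= y < n:
                b[x, y] = 1.0
    return b / b.sum(axis=1, keepdims=True)

def optimal_matrix(b, v, gamma):
    """LMDP-style optimal transition: pi(y|x) proportional to b(y|x) * exp(gamma * V(y))."""
    w = b * np.exp(gamma * v)[None, :]
    return w / w.sum(axis=1, keepdims=True)

def transition_counts(p, n_samples, rng):
    """Count sampled transitions (x, y) with x uniform and y drawn from p(.|x)."""
    n = p.shape[0]
    counts = np.zeros((n, n))
    x = rng.integers(0, n, size=n_samples)
    for s in range(n):
        y = rng.choice(n, size=int((x == s).sum()), p=p[s])
        counts[s] += np.bincount(y, minlength=n)
    return counts

def fit(counts_opt, counts_base, gamma, lr=1.0, n_iters=30000):
    """Logistic regression with logit f(x, y) = q(x) + gamma * V(y) - V(x).

    Samples from the optimal transition are labeled 1, baseline samples 0.
    """
    n = counts_opt.shape[0]
    q, v = np.zeros(n), np.zeros(n)
    total = counts_opt.sum() + counts_base.sum()
    for _ in range(n_iters):
        f = q[:, None] + gamma * v[None, :] - v[:, None]
        prob = 1.0 / (1.0 + np.exp(-f))
        # Gradient of the mean logistic loss with respect to each logit f(x, y).
        g = ((counts_opt + counts_base) * prob - counts_opt) / total
        q -= lr * g.sum(axis=1)
        v -= lr * (gamma * g.sum(axis=0) - g.sum(axis=1))
    return q, v

rng = np.random.default_rng(0)
b = baseline_matrix(N_STATES)
v_true = np.linspace(0.0, 4.0, N_STATES)   # ground-truth value function (made up)
pi = optimal_matrix(b, v_true, GAMMA)

counts_opt = transition_counts(pi, 20000, rng)
counts_base = transition_counts(b, 20000, rng)
q_hat, v_hat = fit(counts_opt, counts_base, GAMMA)

support = b > 0   # transitions that can actually occur
f_hat = q_hat[:, None] + GAMMA * v_hat[None, :] - v_hat[:, None]
log_ratio_hat = f_hat[support]
log_ratio_true = np.log(pi[support] / b[support])
max_err = np.max(np.abs(log_ratio_hat - log_ratio_true))
print("max abs error of estimated log ratio:", max_err)
```

In this sketch the fitted logit approximates ln(pi(y|x) / b(y|x)) on the transitions that can occur, while q and V themselves are determined only up to a shared offset. A deep variant would presumably replace the tables q and V with neural networks, which is outside the scope of this toy example.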

Journal: Neural Processing Letters
Publication Date: Sep 8, 2017
Citations: 33
License type: open-access

Similar Papers

Deep Inverse Reinforcement Learning by Logistic Regression
Eiji Uchibe
01 Jan 2015

Deep Inverse Reinforcement Learning for Behavior Prediction in Autonomous Driving: Accurate Forecasts of Vehicle Motion
Tharindu Fernando ... Clinton Fookes
IEEE Signal Processing Magazine | VOL. 38
28 Dec 2020

Generative Adversarial Immitation Learning for Steering an Unmanned Surface Vehicle
Alexandra Vedeler ... Narada Warakagoda
Proceedings of the Northern Lights Deep Learning Workshop | VOL. 1
07 Feb 2020

Joint path planning and power allocation of a cellular-connected UAV using apprenticeship learning via deep inverse reinforcement learning
Alireza Shamsoshoara ... İsmail Güvenç
Computer Networks | VOL. 254
12 Sep 2024
