Backward Q-learning: The combination of Sarsa algorithm and Q-learning

Yin-Hao Wang, Tzuu-Hseng S. Li, Chih-Jui Lin

doi:10.1016/j.engappai.2013.06.016

Abstract

Reinforcement learning (RL) has been applied in many fields, but the trade-off between exploration and exploitation in the action selection policy remains a dilemma. Q-learning and Sarsa are the best-known RL algorithms, and they have different characteristics. Generally speaking, Sarsa converges faster, while Q-learning achieves better final performance. However, Sarsa is easily trapped in local minima, and Q-learning needs a longer time to learn. Most of the literature investigates the action selection policy. Instead of studying an action selection strategy, this paper focuses on how to combine Q-learning with the Sarsa algorithm and presents a new method, called backward Q-learning, which can be implemented in both the Sarsa algorithm and Q-learning. The backward Q-learning algorithm directly tunes the Q-values, and the Q-values then indirectly affect the action selection policy. As a result, the proposed RL algorithms can enhance learning speed and improve final performance. Finally, three experiments (cliff walk, mountain car, and cart–pole balancing control) are used to verify the feasibility and effectiveness of the proposed scheme. All the simulations show that the backward Q-learning based RL algorithms outperform the well-known Q-learning and Sarsa algorithms.
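
The abstract does not spell out the update rules, so the following is only a minimal sketch of the idea under stated assumptions. It shows the standard tabular Sarsa and Q-learning updates, plus one plausible reading of "backward" tuning of the Q-values: the transitions of an episode are recorded and then replayed from last to first with the Q-learning target. The function names, the `episode` format, and the values of `ALPHA` and `GAMMA` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value, not from the paper)
GAMMA = 0.9  # discount factor (assumed value, not from the paper)


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=ALPHA, gamma=GAMMA):
    """On-policy update: bootstrap from the action actually chosen next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s_next, actions, alpha=ALPHA, gamma=GAMMA):
    """Off-policy update: bootstrap from the greedy action in the next state."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def backward_q_update(Q, episode, actions, alpha=ALPHA, gamma=GAMMA):
    """Assumed backward pass: replay recorded (s, a, r, s_next) tuples in
    reverse order, so a reward at the end of the episode reaches earlier
    state-action pairs within a single pass."""
    for s, a, r, s_next in reversed(episode):
        q_learning_update(Q, s, a, r, s_next, actions, alpha, gamma)


# Toy chain 0 -> 1 -> 2 -> 3 with a reward of 1 on the final step.
# State 3 is terminal and never updated, so its Q-values stay at 0.
actions = ["left", "right"]
episode = [(0, "right", 0.0, 1), (1, "right", 0.0, 2), (2, "right", 1.0, 3)]

Q = defaultdict(float)
backward_q_update(Q, episode, actions, alpha=1.0, gamma=0.5)
print(Q[(0, "right")], Q[(1, "right")], Q[(2, "right")])  # 0.25 0.5 1.0
```

In this toy chain a single backward pass carries the final reward all the way back to the first state, whereas one forward pass over the same three transitions would change only the last state-action pair. That is the intuition behind tuning the Q-values directly rather than changing the action selection policy.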

Journal: Engineering Applications of Artificial Intelligence
Publication Date: Aug 12, 2013
Citations: 121
