Monte Carlo tree search with temporal-difference learning for general video game playing

Ercument Ilhan, A. Sima Etaner-Uyar

doi:10.1109/cig.2017.8080453

Abstract

General Video Game Playing (GVGP) is the problem of creating an agent that can successfully play multiple games with different properties without any prior knowledge about them. As an important subfield of General Artificial Intelligence, GVGP has drawn considerable interest, and research in this field intensified with the release of the General Video Game AI framework and competition. Although this problem has been approached with many different techniques, it is still far from being solved. Monte Carlo Tree Search (MCTS) is one of the most promising baseline approaches in the literature. In this study, the MCTS algorithm is enhanced with a recently developed temporal-difference learning method, True Online Sarsa(λ), so that it can exploit domain knowledge by using past experience. Experiments show that the proposed modifications significantly improve the performance of MCTS in GVGP, and that the application of reinforcement learning techniques in this domain is a promising subject that needs further research.

Full Text