Abstract

The puzzle game 2048, a single-player stochastic game played on a 4 × 4 grid, is the most popular among similar slide-and-merge games. One of the strongest computer players for 2048 uses temporal difference learning (TD learning) on so-called N-tuple networks, where the shapes of the N-tuples are designed by humans based on characteristics of the game. In our previous work (Oka and Matsuzaki, 2016), we proposed a systematic method of selecting N-tuples under the assumption that the interinfluence among those N-tuple networks is negligible. Though the selected N-tuple networks worked well, large gaps remained between them and the human-designed networks. In this paper, another systematic and game-characteristics-free method of selecting N-tuples is proposed for the game 2048, in which the interinfluence among the N-tuple networks is captured. The proposed method is effective and generic: the selected N-tuple networks are as good as human-designed ones under the same setting, and we can obtain larger (or smaller) N-tuple networks in the same manner. We also report experimental results for combining the N-tuple networks with expectimax search.
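
For readers unfamiliar with the terminology, the following is a minimal sketch of how an N-tuple network can serve as a value function trained with a TD-style update. It is not the implementation used in this work: the class name `NTupleNetwork`, the board encoding (a flat list of 16 tile exponents, 0 for an empty cell), the learning rate, and the example tuple shapes are all assumptions chosen for illustration, and refinements such as board symmetries are omitted.

```python
from collections import defaultdict


class NTupleNetwork:
    """Illustrative value function: a sum of lookup tables, one per N-tuple."""

    def __init__(self, tuples, alpha=0.1):
        # Each tuple lists the cell indices (0..15) it observes on the 4 x 4 board.
        self.tuples = [tuple(t) for t in tuples]
        self.alpha = alpha  # learning rate (assumed value)
        # One sparse table per tuple; unseen patterns default to weight 0.0.
        self.tables = [defaultdict(float) for _ in self.tuples]

    def _keys(self, board):
        # The pattern a tuple sees is the sequence of tile exponents at its cells.
        return [tuple(board[i] for i in t) for t in self.tuples]

    def value(self, board):
        # The board's value is the sum of the weights selected by each tuple.
        return sum(tab[key] for tab, key in zip(self.tables, self._keys(board)))

    def td_update(self, board, target):
        # Move the estimate toward the target, splitting the step evenly
        # across the tuples that contributed to the estimate.
        error = target - self.value(board)
        step = self.alpha * error / len(self.tuples)
        for tab, key in zip(self.tables, self._keys(board)):
            tab[key] += step
        return error


# Hypothetical example: two 4-tuples covering the top two rows.
net = NTupleNetwork([[0, 1, 2, 3], [4, 5, 6, 7]], alpha=0.5)
board = [1, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
net.td_update(board, target=10.0)
print(net.value(board))  # 5.0
```

In this picture, selecting N-tuples amounts to choosing which lists of cell indices are passed to the constructor. Because the value is a sum over all tables, the weights learned for one tuple depend on which other tuples are present.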
