Only-One-Victor Pattern Learning in Computer Go

Jiao Wang, Tan Zhu, Chu-Husan Hsueh, Chenjun Xiao, I-Chen Wu, Wen-Jie Tseng

doi:10.1109/tciaig.2015.2504108

Abstract

Automatically acquiring domain knowledge from professional game records, a form of pattern learning, is an attractive and challenging problem in computer Go. This paper proposes a supervised learning method that introduces a new generalized Bradley-Terry model, named Only-One-Victor, to learn patterns from game records. Our algorithm follows the same basic idea as the Elo rating algorithm: it treats each move in the game records as a group of move patterns, and the selected move as the winner of a competition among all groups on the current board. However, unlike the generalized Bradley-Terry model for group competition used in the Elo rating algorithm, the Only-One-Victor model simulates the process of selecting from a set of possible candidates by treating that process as a group of independent pairwise comparisons. We use a graph-theoretic model to prove the correctness of the Only-One-Victor model, and we apply the Minorization-Maximization (MM) algorithm to solve the optimization task. As a result, our algorithm retains many computational advantages of the Elo rating algorithm, such as scalability to high-dimensional feature spaces. With a training set of 115,832 moves and the same feature settings, our experiments show that Only-One-Victor outperforms Elo rating, a well-known, top-performing supervised pattern learning method.
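To make the contrast between the two models concrete, the following is a minimal sketch and not the paper's implementation. It assumes that a candidate move's strength is the product of the strengths of its patterns, as in the generalized Bradley-Terry model for group competition. The function `pairwise_victor_scores` is only one plausible reading of "a group of independent pairwise comparisons": the abstract does not state the exact Only-One-Victor likelihood, so the formula, the function names, and the pattern names (`atari`, `edge`, `hane`, `empty`) with their strength values are all hypothetical.

```python
from math import prod

def group_strength(patterns, gamma):
    """Strength of one candidate move: product of its pattern strengths."""
    return prod(gamma[p] for p in patterns)

def group_competition_probs(candidates, gamma):
    """Generalized Bradley-Terry for group competition:
    P(i is selected) = s_i / sum_j s_j."""
    s = [group_strength(c, gamma) for c in candidates]
    total = sum(s)
    return [x / total for x in s]

def pairwise_victor_scores(candidates, gamma):
    """ASSUMPTION (illustrative only): score of candidate i is the probability
    that it wins every independent pairwise comparison,
    prod_{j != i} s_i / (s_i + s_j)."""
    s = [group_strength(c, gamma) for c in candidates]
    n = len(s)
    return [
        prod(s[i] / (s[i] + s[j]) for j in range(n) if j != i)
        for i in range(n)
    ]

# Hypothetical pattern strengths and three candidate moves on one board.
gamma = {"atari": 4.0, "edge": 0.5, "hane": 2.0, "empty": 1.0}
candidates = [["atari", "edge"], ["hane"], ["empty"]]

group_probs = group_competition_probs(candidates, gamma)
pairwise_scores = pairwise_victor_scores(candidates, gamma)
```

Under this reading, the pairwise scores do not sum to 1 in general, because a set of independent comparisons can end with no single candidate beating all the others. How the paper handles this case is not stated in the abstract.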

Full Text