Abstract

Recent advances in game-playing AI that combine neural networks with Monte Carlo Tree Search (MCTS) through self-play have markedly improved performance on challenging board games such as Go and chess. This combination has two significant merits: it handles large state spaces, and it learns with little human knowledge. Neural networks make it possible to encode and store large amounts of state information in a finite structure, while self-play generates a self-supervised curriculum that makes learning from scratch possible. In this dissertation, motivated by AlphaZero, we investigate how to apply the neural MCTS algorithm to more general yet practical problems (such as math puzzles, QSAT, and model checking). Our primary methodology is to express the target problem as a logical statement (in recursive first-order logic) and then transform it into a semantic game that can be played and learned with a neural MCTS algorithm. We build a framework that takes recursive first-order logic statements as input and generates the corresponding semantic games to play and learn. The framework user can then map the learned winning strategy back to a verification or falsification of the original statement. We have also observed several deficiencies in the original neural MCTS algorithm when it is applied to semantic games. These are caused by the asymmetry of such games, which leads to an imbalanced learning experience for the two players and eventually makes the algorithm fall quickly into a local sub-optimal solution. To mitigate this issue, we propose several modifications, based on connections with reinforcement learning (RL) and double oracle (DO), that make the algorithm better fit our framework.--Author's abstract
