We tackle the classical traveling salesman problem (TSP) by combining a graph neural network with Monte Carlo Tree Search (MCTS). We adopt a greedy framework that constructs a promising tour by adding vertices one at a time. A graph neural network is trained to capture graph motifs and interactions between vertices, and to output the prior probability of selecting each vertex at every step. Instead of making decisions directly from the network's output, we combine the network with MCTS to obtain a more reliable policy: the output of the search fuses the prior probability with feedback from scouting exploration. Without much heuristic design, our approach outperforms recent state-of-the-art learning-based methods on the TSP. Experimental results demonstrate that the proposed method generalizes to instances with more vertices than those used during training.
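The construction loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the trained graph neural network is replaced by a hypothetical distance-based softmax prior (`prior`), leaf evaluation uses random rollouts, and the selection rule is a PUCT-style formula that the abstract does not specify. All function names and parameters (`solve`, `search`, `n_sim`, `c_puct`) are assumptions made for illustration.

```python
import math
import random


def tour_length(points, tour):
    """Length of the closed tour visiting `tour` in order."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))


def prior(points, partial, candidates):
    """Stand-in for the GNN (assumption): softmax over negative distance
    from the last vertex of the partial tour to each candidate."""
    cur = points[partial[-1]]
    weights = [math.exp(-math.dist(cur, points[v])) for v in candidates]
    total = sum(weights)
    return {v: w / total for v, w in zip(candidates, weights)}


class Node:
    def __init__(self, partial, p):
        self.partial = partial  # tuple of vertices visited so far, in order
        self.p = p              # prior probability of reaching this node
        self.n = 0              # visit count
        self.q = 0.0            # running mean of rollout values
        self.children = {}      # next vertex -> Node


def rollout(points, partial, rng):
    """Complete the tour in random order; value is the negative tour length."""
    rest = [v for v in range(len(points)) if v not in partial]
    rng.shuffle(rest)
    return -tour_length(points, list(partial) + rest)


def search(points, partial, n_sim, c_puct, rng):
    """Run tree search from `partial` and return the most visited next vertex."""
    root = Node(tuple(partial), 1.0)
    for _ in range(n_sim):
        node, path = root, [root]
        # Selection: descend with a PUCT-style score until reaching a leaf.
        while node.children:
            total = sum(ch.n for ch in node.children.values())
            node = max(node.children.values(),
                       key=lambda ch: ch.q + c_puct * ch.p
                       * math.sqrt(total + 1) / (1 + ch.n))
            path.append(node)
        # Expansion: attach one child per unvisited vertex, weighted by the prior.
        remaining = [v for v in range(len(points)) if v not in node.partial]
        if remaining:
            probs = prior(points, node.partial, remaining)
            for v in remaining:
                node.children[v] = Node(node.partial + (v,), probs[v])
        # Evaluation and backup: update running means along the visited path.
        value = rollout(points, node.partial, rng)
        for visited in path:
            visited.n += 1
            visited.q += (value - visited.q) / visited.n
    return max(root.children, key=lambda v: root.children[v].n)


def solve(points, n_sim=200, c_puct=1.0, seed=0):
    """Greedy construction: start at vertex 0 and add one vertex per search."""
    rng = random.Random(seed)
    tour = [0]
    while len(tour) < len(points):
        tour.append(search(points, tour, n_sim, c_puct, rng))
    return tour


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [(rng.random(), rng.random()) for _ in range(8)]
    tour = solve(pts)
    print(tour, round(tour_length(pts, tour), 3))
```

In this sketch the final decision at each step is taken from the root visit counts rather than from the prior alone, which is one common way to fuse a learned prior with search feedback. Because rollout values are negative tour lengths and unvisited children start at zero, every child of a node is tried at least once before the search concentrates on the better ones; a practical system would likely normalize values and replace the random rollout with a learned or heuristic evaluator.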