Abstract

The Traveling Salesman Problem (TSP) is a classical NP-hard combinatorial optimization problem with many practical variants. The Lin–Kernighan–Helsgaun (LKH) algorithm is one of the state-of-the-art local search algorithms for the TSP, and LKH-3 is a powerful extension of LKH that can solve many TSP variants. Both LKH and LKH-3 use candidate sets to improve their efficiency and provide two methods, α-measure and POPMUSIC, for determining those candidate sets. In this work, we first propose a Variable Strategy Reinforced LKH (VSR-LKH) algorithm for the TSP, which combines three reinforcement learning methods (Q-learning, Sarsa, and Monte Carlo) with LKH. We further propose VSR-LKH-3, which combines the variable strategy reinforcement learning method with LKH-3 for typical TSP variants, including the TSP with Time Windows (TSPTW) and the Colored TSP (CTSP). The proposed algorithms replace the inflexible traversal operations in LKH and LKH-3 with reinforcement learning, so that the algorithms learn which choice to make at each search step. Our methods significantly improve both LKH and LKH-3 with either the α-measure or the POPMUSIC method. Extensive experiments on 236 widely used TSP benchmarks with up to 85,900 cities demonstrate the excellent performance of VSR-LKH, and VSR-LKH-3 significantly outperforms the state-of-the-art heuristics for the TSPTW and CTSP.
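
To make the three reinforcement learning methods named above concrete, the following is a minimal sketch of their textbook value updates, applied to choosing a candidate city. It is not the VSR-LKH implementation: the abstract does not specify states, actions, or rewards, so treating the current city as the state and a candidate neighbor as the action is an assumption, as are all function names, the epsilon-greedy rule, and the default values of `alpha`, `gamma`, and `epsilon`.

```python
import random

# Illustrative only: q maps (city, candidate) pairs to learned values.
# State = current city, action = candidate neighbor (an assumption).

def q_learning_update(q, state, action, reward, next_state, next_actions,
                      alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best value in the next state."""
    best_next = max((q.get((next_state, a), 0.0) for a in next_actions),
                    default=0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    old = q.get((state, action), 0.0)
    target = reward + gamma * q.get((next_state, next_action), 0.0)
    q[(state, action)] = old + alpha * (target - old)

def monte_carlo_update(q, state, action, episode_return, alpha=0.1):
    """Move the value toward the full observed return, with no bootstrapping."""
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (episode_return - old)

def choose_candidate(q, city, candidates, epsilon=0.1, rng=random):
    """Epsilon-greedy choice among a city's candidates (hypothetical policy)."""
    if rng.random() < epsilon:
        return rng.choice(candidates)
    return max(candidates, key=lambda c: q.get((city, c), 0.0))
```

In this sketch, a learned choice such as `choose_candidate` would stand in for a fixed traversal order over the candidate set; how the strategy varies among the three updates is left to the full text.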
