Vehicle routing optimization is a core task for transportation service providers: better routes significantly reduce operating expenses and improve customer satisfaction. Learning to solve routing optimization problems automatically could be the next significant step forward in optimization technology. Despite recent advances in automatically learned heuristics for routing problems, state-of-the-art traditional methods such as Lin-Kernighan-Helsgaun (LKH) still outperform machine-learning-based approaches. To narrow this gap, we propose a novel technique called self-supervised reinforcement learning (SSRL), which combines self-supervised learning with the LKH heuristic. SSRL uses two decoders: a node decoder, trained with reinforcement learning to predict node penalties, and an edge decoder, trained with self-supervised learning to predict edge scores. The self-supervised part, trained with a cross-entropy loss, provides strong gradient signals for parameter updates, while the reinforcement learning component acts as a regularizer that steers the supervised part toward the task-specific reward. SSRL learns and replicates all of LKH’s key components, improving on the original LKH in both generalization and performance. In experiments on multiple vehicle routing problems, SSRL demonstrates superior accuracy and efficiency compared with existing methods. Our results provide empirical evidence of SSRL’s effectiveness and its potential as a promising approach to complex routing problems.
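To make the combined objective concrete, the following is a minimal sketch in PyTorch of how a cross-entropy edge-scoring term might be paired with a REINFORCE-style node-penalty term acting as a regularizer. All names here (`combined_loss`, `edge_logits`, `node_log_probs`, `beta`) are hypothetical illustrations, not the paper’s actual implementation, and the exact formulation in SSRL may differ.

```python
import torch
import torch.nn.functional as F

def combined_loss(edge_logits, edge_labels, node_log_probs, rewards, beta=0.1):
    """Hypothetical combined objective: supervised cross-entropy on edge
    scores plus a REINFORCE-style policy-gradient term on node penalties.
    All names are illustrative; the paper's exact formulation may differ."""
    # Self-supervised term: cross-entropy between predicted edge scores and
    # pseudo-labels (e.g., which candidate edges LKH actually uses).
    ce = F.cross_entropy(edge_logits, edge_labels)

    # Reinforcement-learning term acting as a regularizer: a policy-gradient
    # estimate with a mean-reward baseline to reduce variance.
    advantage = rewards - rewards.mean()
    pg = -(advantage.detach() * node_log_probs).mean()

    # beta trades off the reward-driven regularizer against the strong
    # gradient signal from the supervised term.
    return ce + beta * pg
```

In this reading, the cross-entropy term supplies dense, low-variance gradients, while the policy-gradient term injects information about the downstream tour-quality reward that the labels alone do not capture.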