Abstract

Reinforcement learning-based methods have shown great potential for solving combinatorial optimization problems. However, research in this area is not yet mature in terms of either models or training methods. This paper proposes a method that combines reinforcement learning with contrastive self-supervised learning. Specifically, the proposed method uses an attention model to learn a policy for generating solutions and couples it with a contrastive self-supervised learning model that trains the attention encoder node by node. Correspondingly, a two-phase learning method, consisting of node-wise learning and solution-wise learning, is adopted to train the attention model and the contrastive self-supervised model jointly. The performance of the proposed method is verified by numerical experiments on various combinatorial optimization problems.
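
As a rough illustration of what a node-wise contrastive objective can look like, the sketch below computes an InfoNCE-style loss over two views of the same node embeddings. The abstract does not state which contrastive loss the paper uses, so InfoNCE, the function name `info_nce_node_loss`, the `temperature` parameter, and the two-view setup are all assumptions made for this example, not the authors' method.

```python
import numpy as np

def info_nce_node_loss(view_a, view_b, temperature=0.1):
    """InfoNCE-style loss over nodes (hypothetical helper, not from the paper).

    view_a, view_b: arrays of shape (n_nodes, dim) holding embeddings of the
    same nodes under two views. Row i of view_a is treated as the positive
    pair of row i of view_b; all other rows of view_b act as negatives.
    """
    # L2-normalise so the dot product is a cosine similarity.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)

    # Pairwise similarities between every node in view A and every node in view B.
    logits = a @ b.T / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The diagonal holds the log-probability assigned to each node's positive.
    return float(-np.mean(np.diag(log_prob)))


rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))                              # assumed node embeddings
aligned = info_nce_node_loss(emb, emb + 0.01 * rng.normal(size=emb.shape))
mismatched = info_nce_node_loss(emb, emb[::-1])             # positives deliberately misaligned
```

In this toy setup the loss is small when each node's two views agree and large when the pairing is broken. In a two-phase scheme such as the one described above, a term of this kind would play the role of the node-wise signal for the encoder, while the solution-wise phase would rely on a reinforcement learning objective.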
