Abstract

Vehicle routing problem (VRP) is one of the classic combinatorial optimization problems where an optimal tour to visit customers is required with a minimum total cost in the presence of some constraints. Recently, VRP is being solved with the use of deep reinforcement learning (DRL), with node sets considered (represented) as a graph structure. Existing Transformer based DRL solutions for VRP rely only on node information ignoring the role of information on the edges between nodes in the graph structure. In this paper, we proposed an attention‐based end‐to‐end DRL model to solve VRP which embeds edge information between nodes for rich graph representation learning. We used Transformer based encoder‐decoder framework with an edge information embedded multi‐head attention (EEMHA) layer in the encoder. The EEMHA‐based encoder learns underlying structure of the graph and generates an expressive graph topology representation by merging node and edge information. We trained our model using proximal policy optimization (PPO) with some code‐level optimization techniques. We conducted an experiment on randomly generated instances and on a real‐world data generated from road networks to verify the performance of our proposed model. The result from all experiments show that our model performs better than the existing DRL methods and most of the conventional heuristics with good generalizability from random instance training to real‐world instance testing of different problem size. © 2023 Institute of Electrical Engineers of Japan. Published by Wiley Periodicals LLC.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call