Abstract

The routing of automated guided vehicles (AGVs) plays an increasingly important role in modern logistics. The AGV routing problem is a complex combinatorial optimization problem, and meta-heuristic algorithms fail to deliver the desired results because of the problem's stringent real-time requirements. Large AGV systems in engineering practice are usually simplified by adding regulations, which may yield only sub-optimal solutions. In this paper, we present a deep reinforcement learning algorithm for the AGV routing problem. First, the routing problem is modeled as a Markov decision process (MDP), enabling real-time routing. Second, based on the properties of the AGVs' working scene, an asynchronous DQN (deep Q-network) serves as the base framework for reinforcement learning. More importantly, the map of the working scene is discretized and represented with an embedding technique. Compared with one-hot encoding, the embedding mode's input size is much smaller, which greatly improves training speed. The extracted embeddings are assembled into conflict vectors, which are finally processed by an LSTM (long short-term memory) network. Experiments show that the proposed algorithm is effective both in real-time response speed and in the quality of its solutions.
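The input-size difference between one-hot and embedding representations can be illustrated with a minimal sketch (this is not the paper's implementation; the map size and embedding dimension are assumed values, and the embedding table here is random rather than learned):

```python
import numpy as np

# Assume the working-scene map is discretized into 10,000 grid cells.
NUM_CELLS = 10_000
EMBED_DIM = 16  # assumed embedding dimension

# One-hot mode: each cell becomes a sparse vector of length NUM_CELLS.
def one_hot(cell_id: int) -> np.ndarray:
    v = np.zeros(NUM_CELLS, dtype=np.float32)
    v[cell_id] = 1.0
    return v

# Embedding mode: each cell indexes a (normally learned) lookup table,
# yielding a dense vector of length EMBED_DIM instead.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(NUM_CELLS, EMBED_DIM)).astype(np.float32)

def embed(cell_id: int) -> np.ndarray:
    return embedding_table[cell_id]

print(one_hot(42).shape)  # (10000,)
print(embed(42).shape)    # (16,)
```

With these assumed sizes, the network input per cell shrinks from 10,000 values to 16, which is the kind of reduction the abstract credits for the faster training.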
