The automated guided vehicle (AGV) dispatching problem is to develop a rule for assigning transportation tasks to vehicles. This article proposes a new deep reinforcement learning approach with a self-attention mechanism to dispatch tasks to AGVs dynamically. The AGV dispatching system is modeled as a simplified Markov decision process (MDP) that uses vehicle-initiated rules to assign a workcenter to an idle AGV. To deal with the highly dynamic environment, a self-attention mechanism is introduced to calculate the importance of different pieces of information. Invalid action masking is applied to suppress invalid actions. A multimodal structure is employed to fuse features from various sources. Comparative experiments are performed to show the effectiveness of the proposed method. The properties of the learned policies are also investigated under different environment settings. The results show that the policies explore and learn the properties of different systems and also ease traffic congestion. Under certain environment settings, the policy converges to a heuristic rule that assigns the idle AGV to the workcenter with the shortest queue length, which demonstrates the adaptability of the proposed method.
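As a rough illustration of two ideas named in the abstract, the sketch below shows invalid action masking over a set of candidate workcenters and the shortest-queue heuristic that the learned policy is reported to converge to in some settings. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the plain-list representation of logits and queue lengths, and the `valid` flags are all hypothetical, and the abstract does not specify which actions count as invalid.

```python
import math


def masked_softmax(logits, valid):
    """Softmax over action logits with invalid actions forced to probability 0.

    `logits` and `valid` are hypothetical inputs: one finite score and one
    validity flag per candidate workcenter.
    """
    if not any(valid):
        raise ValueError("at least one action must be valid")
    # Replace invalid logits with -inf so that exp() maps them to exactly 0.
    masked = [x if ok else -math.inf for x, ok in zip(logits, valid)]
    peak = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


def shortest_queue_action(queue_lengths, valid):
    """Heuristic baseline: index of the valid workcenter with the shortest queue."""
    candidates = [i for i, ok in enumerate(valid) if ok]
    if not candidates:
        raise ValueError("at least one action must be valid")
    return min(candidates, key=lambda i: queue_lengths[i])


# Hypothetical example: three workcenters, the first one masked out.
probs = masked_softmax([2.0, 0.5, 1.0], [False, True, True])
choice = shortest_queue_action([3, 1, 2], [True, False, True])
```

In this example the masked workcenter receives zero probability even though it has the highest logit, and the heuristic skips the workcenter with the shortest queue because it is flagged invalid. A neural policy would typically apply the same mask to its output logits before sampling an action.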