This paper considers the real-time dispatching of a fleet of automated guided vehicles (AGVs) with battery constraints. Transport requests arrive randomly, and each must be assigned to an AGV immediately. In addition, AGVs must be repositioned and recharged while awaiting future transport requests. Each transport request has a soft time window, and late delivery incurs a tardiness cost. This research aims to minimize the total cost, consisting of the tardiness costs of transport requests and the travel costs of AGVs. We extend the existing literature by distinguishing between parking nodes, where AGVs wait idle for incoming transport requests, and charging nodes, where they satisfy their charging needs. We formulate this online decision-making problem as a Markov decision process and propose a solution approach based on deep reinforcement learning. To assess the quality of the proposed approach, we compare it with the optimal solution of a mixed-integer linear programming model that assumes full knowledge of transport requests in hindsight and hence serves as a lower bound on the costs. We also compare our solution with a heuristic policy used in practice. We assess the performance of the proposed solutions in an industry case study using real-world data.
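As a minimal sketch of the objective described above, the following Python snippet computes a total cost as the sum of tardiness and travel costs. The linear tardiness penalty, the per-unit rates, and all names (`Request`, `tardiness_rate`, `travel_rate`) are assumptions made for illustration; the abstract does not specify the exact functional form of either cost term.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Request:
    """A served transport request (hypothetical structure)."""
    due_time: float       # end of the soft time window
    delivery_time: float  # time at which the AGV actually delivered


def tardiness(request: Request) -> float:
    """Lateness beyond the soft time window; early or on-time delivery costs nothing."""
    return max(0.0, request.delivery_time - request.due_time)


def total_cost(
    requests: Iterable[Request],
    agv_travel_distance: float,
    tardiness_rate: float,
    travel_rate: float,
) -> float:
    """Tardiness cost of all requests plus travel cost of the fleet.

    Assumes both terms are linear: a fixed cost per unit of lateness and
    a fixed cost per unit of distance travelled (including repositioning
    and trips to charging nodes).
    """
    tardiness_cost = tardiness_rate * sum(tardiness(r) for r in requests)
    travel_cost = travel_rate * agv_travel_distance
    return tardiness_cost + travel_cost


# Example: one request delivered 2 time units late, one delivered early.
served = [Request(due_time=10.0, delivery_time=12.0),
          Request(due_time=20.0, delivery_time=18.0)]
cost = total_cost(served, agv_travel_distance=8.0,
                  tardiness_rate=2.0, travel_rate=1.0)
```

Because the time window is soft, a late delivery is still allowed and only raises the cost. A dispatching policy can therefore trade a small delay against a shorter trip.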