Abstract

On-demand delivery has become prevalent for ordering meals and groceries online, especially during the pandemic. Dispatching massive numbers of orders to a limited pool of couriers is essential to satisfying on-demand delivery users, especially during peak hours. Existing studies mainly focus on order dispatching within a single region and are difficult to apply to the cross-region courier displacement problem due to (i) unique practical factors, including regional spatial-temporal demand-supply dynamics and strict delivery time constraints, and (ii) the large-scale setting and high-dimensional decision space created by the massive number of couriers in on-demand delivery. To address these challenges, we propose an efficient cross-region courier displacement framework, Courier Displacement Reinforcement Learning (CDRL), based on centralized multi-agent actor-critic learning. CDRL first designs an actor-critic network with a time-varying displacement intensity control module to capture demand-supply dynamics, and then employs the centralized-training, decentralized-execution multi-agent framework to address large-scale coordination. One month of real-world order records collected from one of the largest on-demand delivery services in the world is used to evaluate our design. Extensive results show that our method achieves a 47.97% improvement in balancing supply and demand while simultaneously reducing idle ride time by 24.62%.
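To make the described architecture concrete, the following is a minimal sketch of a centralized-critic, decentralized-actor (CTDE) setup with a time-conditioned intensity gate, in the spirit of the abstract. The module names, network dimensions, and the simple sigmoid intensity gate are illustrative assumptions, not the authors' implementation.

```python
# Minimal CTDE actor-critic sketch: decentralized actors choose displacement
# targets from local observations; a centralized critic scores the joint
# observation-action during training only. All sizes are illustrative.
import torch
import torch.nn as nn


class DisplacementActor(nn.Module):
    """Decentralized actor: maps a courier's local observation to a
    categorical policy over candidate destination regions."""

    def __init__(self, obs_dim: int, n_regions: int, hidden: int = 64):
        super().__init__()
        self.policy = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_regions),
        )
        # Hypothetical time-varying intensity gate: scales displacement
        # logits with a factor learned from a time-of-day feature.
        self.intensity = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, obs: torch.Tensor, time_feat: torch.Tensor):
        logits = self.policy(obs) * self.intensity(time_feat)
        return torch.distributions.Categorical(logits=logits)


class CentralizedCritic(nn.Module):
    """Centralized critic: evaluates the joint observations and actions of
    all couriers; used only in training, so execution stays decentralized."""

    def __init__(self, n_agents: int, obs_dim: int, n_regions: int,
                 hidden: int = 128):
        super().__init__()
        self.value = nn.Sequential(
            nn.Linear(n_agents * (obs_dim + n_regions), hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, joint_obs: torch.Tensor, joint_actions: torch.Tensor):
        return self.value(torch.cat([joint_obs, joint_actions], dim=-1))


# Toy usage: 4 couriers, 8-dim local observations, 5 candidate regions.
n_agents, obs_dim, n_regions = 4, 8, 5
actor = DisplacementActor(obs_dim, n_regions)
critic = CentralizedCritic(n_agents, obs_dim, n_regions)

obs = torch.randn(n_agents, obs_dim)          # per-courier local observations
time_feat = torch.full((n_agents, 1), 0.75)   # normalized time of day
dist = actor(obs, time_feat)                  # shared-parameter actors
actions = dist.sample()
onehot = torch.nn.functional.one_hot(actions, n_regions).float()
q = critic(obs.reshape(1, -1), onehot.reshape(1, -1))
print(q.shape)  # torch.Size([1, 1])
```

Under this sketch, parameter sharing across actors keeps the per-agent decision decentralized while the critic conditions on the full joint state, which is the standard way CTDE handles large numbers of homogeneous agents such as couriers.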
