Abstract

As an important part of intelligent transportation systems, On-demand Food Delivery (OFD) has become a prevalent logistics service in modern society. As the scale of transactions continues to grow, rider-centered assignment is gaining more traction among food delivery companies than traditional platform-centered assignment. However, problems such as the dynamic arrival of orders, uncertain rider behavior and various kinds of false-negative feedback prevent the platform from making proper decisions when interacting with riders. To address these issues, we propose an online Deep Reinforcement Learning-based Order Recommendation (DRLOR) framework for the decision-making problem in the OFD scenario. The problem is modeled as a Markov Decision Process (MDP). The DRLOR framework consists mainly of three networks: an actor-critic network that learns an optimal order ranking policy at each interaction step; a rider behavior prediction network that predicts the grabbing behavior of riders; and an attention-based feedback correlation network that separates valid feedback from false feedback and learns a high-dimensional state embedding to represent the states of riders. Extensive offline and online experiments are conducted on the Meituan delivery platform, and the results demonstrate that the proposed DRLOR framework can significantly shorten the length of interactions between riders and the platform, leading to a better experience for both riders and customers.
