Abstract

Dynamic routing problems with crowd-sourced drivers are challenging because crowd drivers who arrive dynamically must be matched to customer demand that is also placed dynamically. To guarantee a certain service level, the company also operates a fleet of employed vehicles as backup capacity for tasks that a crowd driver cannot serve feasibly or efficiently. In this paper, we study a dynamic routing problem that arises in such an environment, in which a mix of company vehicles and dynamically arriving crowd drivers serves dynamically placed orders. The company seeks routing policies for crowd drivers and company vehicles that minimize total cost. We present a hierarchical reinforcement learning method that combines an upper-level agent, which balances the opportunities and risks of delayed batch matching, with lower-level agents for route planning. Extensive numerical analyses demonstrate the effectiveness of our approach and show the benefits of both the upper-level agent and the dual-decoder architecture designed for the lower-level agents. In these experiments, the proposed approach outperforms existing methods in solution quality, computation time, and generalization ability.
