Optimal Learning for Urban Delivery Fleet Allocation

Yixiao Huang, Yue Tong, Lei Zhao, Ilya O Ryzhov, Warren B Powell

doi:10.1287/trsc.2018.0861

Abstract

In a two-tiered city logistics system, an urban logistics company usually partitions the urban area into regions and allocates its delivery fleet (e.g., vehicles, couriers) to these regions. On a daily basis, the delivery station in each region receives the delivery packages from the city distribution centers and delivers them to customers within the region, using its allocated delivery vehicles. A tactical decision in such a city logistics system is the allocation of its delivery fleet to the regions to minimize the expected operational cost of the entire system. However, because of the complexity of the urban delivery operations and the day-to-day variance of the customer demand, an accurate evaluation of the expected operational cost associated with an allocation decision can be very expensive. We propose a learning policy that adaptively selects the fleet allocation to learn the underlying expected operational cost function by incorporating the value of information. Specifically, we exploit the monotonicity of the expected operational cost in the number of allocated delivery vehicles in a region and extend the idea of knowledge gradient with discrete priors with resampling and regeneration (KGDP-R&R). Our numerical results demonstrate the effectiveness of KGDP-R&R against other learning policies as well as its managerial implications compared with heuristics in practice. The online appendix is available at https://doi.org/10.1287/trsc.2018.0861.
