Abstract

Resource allocation problems often arise as online decision-making tasks in which the right allocation strategy depends on an understanding of the allocation environment and the resource workload. Most existing resource allocation methods rely on carefully designed heuristics that ignore the patterns of incoming tasks, so they cannot properly handle the dynamics of those tasks. To address this problem, we mine task patterns from a large volume of historical allocation data and propose a reinforcement learning model, termed IRDA, that learns the allocation strategy incrementally. We observe that historical allocation data is usually generated by operations repeated daily and is therefore not independent and identically distributed. As a result, the strategy may converge after training on only part of the dataset, leaving the remaining data unused. To improve learning efficiency, we partition the full historical dataset into multiple batch datasets, which forces the agent to keep ‘exploring’ and learning on distinct state spaces. We apply the proposed method to baggage carousel allocation at Hong Kong International Airport (HKIA). The experimental results show that IRDA improves baggage carousel resource utilization by around 51.86% compared to the current baggage carousel allocation system at HKIA.
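The multi-batch idea can be illustrated with a minimal sketch. The abstract does not say how the partition is made or how the agent is updated, so everything here is an assumption for illustration: `partition_into_batches` splits chronologically ordered records into contiguous batches, and `train_on_batch` is a hypothetical placeholder for whatever reinforcement learning update IRDA actually performs.

```python
from typing import Callable, List, Sequence, TypeVar

Record = TypeVar("Record")
Model = TypeVar("Model")


def partition_into_batches(
    records: Sequence[Record], num_batches: int
) -> List[List[Record]]:
    """Split ordered records into contiguous, near-equal batches.

    Assumption: contiguous chronological splitting; the paper may use
    a different partitioning rule.
    """
    if num_batches <= 0:
        raise ValueError("num_batches must be positive")
    base, extra = divmod(len(records), num_batches)
    batches: List[List[Record]] = []
    start = 0
    for i in range(num_batches):
        # The first `extra` batches each take one additional record.
        size = base + (1 if i < extra else 0)
        batches.append(list(records[start:start + size]))
        start += size
    return batches


def train_incrementally(
    model: Model,
    batches: Sequence[Sequence[Record]],
    train_on_batch: Callable[[Model, Sequence[Record]], Model],
) -> Model:
    """Feed batches one at a time, carrying the model forward.

    `train_on_batch` is a hypothetical stand-in for the agent's
    learning step on a single batch.
    """
    for batch in batches:
        model = train_on_batch(model, batch)
    return model
```

In this sketch the model state returned after one batch is the starting point for the next, so later batches refine the strategy rather than retraining it from scratch.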
