Abstract

The container pre-marshalling problem (CPMP) aims to minimise the number of reshuffling moves required, during the non-loading phase, to reach a stacking arrangement in each bay that respects container priorities. Given its sequential decision-making nature, we formulate the CPMP as a Markov decision process (MDP) model that captures the specific states and actions of the reshuffling process. To address the challenge that a relocated container may trigger a chain effect on subsequent reshuffling moves, this paper develops an improved policy-based Monte Carlo tree search (P-MCTS) to solve the CPMP, in which eight composite reshuffling rules and modified upper confidence bounds are employed in the selection phase, and a well-designed heuristic algorithm is utilised in the simulation phase. Meanwhile, given the effectiveness of reinforcement learning methods for solving MDP models, an improved Q-learning algorithm is proposed as a comparison method. Numerical results show that the P-MCTS outperforms all compared methods both in scenarios where all containers have distinct priorities and in scenarios where containers may share the same priority.
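The abstract refers to the selection phase of MCTS, which ranks child nodes with an upper confidence bound. The paper's modified bound and eight composite reshuffling rules are not given here, so the sketch below uses the classic UCB1 (UCT) formulation as a stand-in; `Node`, `uct_score`, and `select_child` are hypothetical names for illustration only.

```python
import math
from dataclasses import dataclass, field

# Minimal sketch of MCTS selection with classic UCB1 (UCT),
# assumed as a stand-in for the paper's modified bound.

@dataclass
class Node:
    visits: int = 0
    total_reward: float = 0.0
    children: list = field(default_factory=list)

def uct_score(child: Node, parent_visits: int, c: float = 1.4) -> float:
    """UCB1: average reward (exploitation) plus an exploration bonus."""
    if child.visits == 0:
        return float("inf")  # always try unvisited children first
    exploitation = child.total_reward / child.visits
    exploration = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploitation + exploration

def select_child(parent: Node) -> Node:
    """Selection phase: descend to the child maximising the UCT score."""
    return max(parent.children, key=lambda ch: uct_score(ch, parent.visits))
```

In a policy-based variant such as the one the abstract describes, the exploration term is typically reweighted by a policy prior (here, the composite reshuffling rules would bias which moves look promising); the constant `c` trades off exploration against exploitation.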
