The Nurse Rostering Problem (NRP) aims to create an efficient and fair work schedule that balances the needs of employees against the requirements of hospital operations. Traditional local search-based metaheuristic algorithms, such as adaptive neighborhood search (ANS) and variable neighborhood descent (VND), mainly focus on improving the current solution without considering the long-term consequences of each move, so they can easily become trapped in local optima, which limits their overall performance. We therefore propose a multi-agent deep Q-network-based metaheuristic algorithm (MDQN-MA) for the NRP that harnesses the strengths of several metaheuristics. Each agent encapsulates one metaheuristic algorithm, and its available actions represent different perspectives on the problem environment. By combining these strengths and perspectives, the agents work collaboratively to explore a broader range of potential solutions. Furthermore, to improve the performance of an individual agent, we model its neighborhood search as a Markov decision process and integrate a deep Q-network so that its sequential choice of neighborhoods accounts for long-term impact. The experimental results show that an individual agent in MDQN-MA outperforms ANS and VND, and that multiple agents in MDQN-MA perform better still, achieving the best results among metaheuristic algorithms on the Second International Nurse Rostering Competition dataset.
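To make the idea of treating neighborhood selection as a sequential decision problem concrete, the following is a minimal sketch and not the MDQN-MA implementation. It makes several assumptions that do not come from the abstract: a tabular Q-learning table stands in for the deep Q-network, only a single agent is shown, the state is simply whether the previous move improved the roster, and the problem is a toy shift-assignment task rather than the competition model. All function names, operators, and parameter values are hypothetical.

```python
import random

# Hypothetical toy problem (NOT the competition model): assign each of
# 12 consecutive shifts to one of 3 nurses.
N_SHIFTS, N_NURSES = 12, 3

def cost(roster):
    """Toy penalty: workload imbalance plus back-to-back shifts for one nurse."""
    loads = [roster.count(n) for n in range(N_NURSES)]
    imbalance = max(loads) - min(loads)
    back_to_back = sum(1 for a, b in zip(roster, roster[1:]) if a == b)
    return imbalance + back_to_back

def reassign(roster, rng):
    """Neighborhood 1: give one random shift to a random nurse."""
    new = roster[:]
    new[rng.randrange(N_SHIFTS)] = rng.randrange(N_NURSES)
    return new

def swap(roster, rng):
    """Neighborhood 2: swap the nurses of two random shifts."""
    new = roster[:]
    i, j = rng.sample(range(N_SHIFTS), 2)
    new[i], new[j] = new[j], new[i]
    return new

OPERATORS = [reassign, swap]  # the agent's actions are neighborhood operators

def q_guided_search(steps=2000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Local search in which tabular Q-learning picks the next neighborhood."""
    rng = random.Random(seed)
    roster = [0] * N_SHIFTS          # poor start: every shift on nurse 0
    best = roster[:]
    # Assumed state: 1 if the previous move improved the roster, else 0.
    q = [[0.0] * len(OPERATORS) for _ in range(2)]
    state = 0
    for _ in range(steps):
        # Epsilon-greedy choice of neighborhood operator.
        if rng.random() < epsilon:
            action = rng.randrange(len(OPERATORS))
        else:
            action = max(range(len(OPERATORS)), key=lambda a: q[state][a])
        candidate = OPERATORS[action](roster, rng)
        reward = cost(roster) - cost(candidate)   # positive if cost dropped
        next_state = 1 if reward > 0 else 0
        # Q-learning update: the discounted term is what lets the choice
        # reflect longer-term effects rather than only the immediate gain.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        if reward >= 0:                           # accept non-worsening moves
            roster = candidate
        if cost(roster) < cost(best):
            best = roster[:]
        state = next_state
    return best, q
```

Calling `q_guided_search()` returns the best roster found and the learned Q-table. In a multi-agent variant, several such agents with different operator sets could share their best solutions, but the cooperation scheme is not specified in the abstract and is left out of this sketch.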