Abstract

Recent advances in Deep Reinforcement Learning (DRL) demonstrate its potential for solving Combinatorial Optimization (CO) problems. DRL shows advantages over traditional methods in both scalability and computational efficiency. However, the DRL problems transformed from CO problems usually have a huge state space, and the main challenge of solving them has shifted from high computational complexity to high sample complexity. Credit assignment determines the contribution of each internal decision to the final success or failure, and it has been shown to be effective in reducing the sample complexity of training. In this paper, we resort to a model-based reinforcement learning method to assign credits for model-free DRL methods. Since heuristic methods play an important role in state-of-the-art solutions for CO problems, we propose using a model to represent this heuristic knowledge and derive the credit assignment from the model. This model-based credit assignment enables the model-free DRL to explore more effectively, and the data collected by the model-free DRL continuously refines the model as training progresses. Extensive experiments on various CO problems with different settings show that our framework outperforms previous state-of-the-art methods in both performance and training efficiency.
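
The loop below is a minimal sketch of the framework the abstract describes, built entirely on assumptions of my own: a toy 0/1 knapsack stands in for the CO problem, a greedy value-density rule plays the role of the heuristic knowledge, and the `heuristic_value` model, its correction weight `w_model`, and the potential-based form of the per-step credit are illustrative choices, not the authors' actual components. It is meant only to show the two-way interaction the abstract highlights: the model assigns per-decision credits that weight the model-free policy-gradient update, and the returns collected by the policy in turn refine the model.

```python
# Illustrative sketch only: environment, heuristic, and update rules are
# hypothetical stand-ins, not the paper's actual method.
import numpy as np

rng = np.random.default_rng(0)
N, CAP = 8, 10                       # items and knapsack capacity
values = rng.uniform(1, 10, N)
weights = rng.uniform(1, 5, N)

def heuristic_value(i, cap, w):
    """Model of heuristic knowledge: greedy value-to-go estimate
    (take remaining items in value-density order while they fit).
    `w` is a learned correction factor refined from collected data."""
    order = sorted(range(i, N), key=lambda j: -values[j] / weights[j])
    v, c = 0.0, cap
    for j in order:
        if weights[j] <= c:
            v += values[j]
            c -= weights[j]
    return w * v

theta = np.zeros((N, 2))             # policy logits for skip/take per item
w_model = 1.0                        # scalar model parameter, refined online
alpha, beta = 0.05, 0.01             # policy / model learning rates

for episode in range(2000):
    cap, ret = CAP, 0.0
    grads, credits = [], []
    for i in range(N):
        logits = theta[i].copy()
        if weights[i] > cap:         # mask the infeasible "take" action
            logits[1] = -1e9
        p = np.exp(logits - logits.max())
        p /= p.sum()
        a = rng.choice(2, p=p)
        r = values[i] if a == 1 else 0.0
        new_cap = cap - weights[i] if a == 1 else cap
        # Model-based credit for this decision: how much it changed the
        # heuristic's estimate of achievable value (potential-based shaping).
        credit = (r + heuristic_value(i + 1, new_cap, w_model)
                    - heuristic_value(i, cap, w_model))
        grads.append((i, np.eye(2)[a] - p))
        credits.append(credit)
        ret += r
        cap = new_cap
    # Model-free policy-gradient update, weighted by the per-step credits
    # instead of a single whole-episode return.
    for (i, g), c in zip(grads, credits):
        theta[i] += alpha * c * g
    # Refine the model with data collected by the model-free agent: scale
    # the heuristic's estimate toward the returns actually observed.
    pred = heuristic_value(0, CAP, w_model)
    if pred > 1e-8:
        w_model += beta * (ret - pred) / pred

print(f"return after training: {ret:.2f}")
```

In this sketch the credit for each decision is the change it induces in the model's value estimate, so decisions the heuristic deems useful are reinforced immediately rather than waiting for the episode's final return; this is one simple way to realize the model-derived credit assignment the abstract refers to, not necessarily the paper's.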
