Hyper-heuristics offer a high level of generality and adaptability, allowing them to solve a wide range of complex optimization problems effectively. Combined with reinforcement learning, hyper-heuristics can exploit the experience and knowledge they have gained to tackle unseen problems, adapting and improving over time. This paper proposes a framework that uses policy-based reinforcement learning to improve the performance of hyper-heuristics. The framework trains hyper-heuristic agents to select the best generalized constructive low-level heuristics for solving combinatorial optimization problems. The framework was evaluated on three benchmark problems: the traveling salesman, capacitated vehicle routing, and bin packing problems. The results showed that the proposed framework outperformed existing meta-heuristic- and hyper-heuristic-based algorithms on all large problem instances in all three problem domains. The framework was also evaluated on a cost optimization problem for deadline-constrained workflow scheduling on a hybrid cloud. Eight agents were trained on medium-sized workflows with two deadlines and then tested against traditional meta-heuristic and hyper-heuristic methods on smaller and larger workflows with unseen deadlines; the evaluation covered four workflow applications, three workflow sizes, and three deadlines. The results showed that the proposed framework produced significantly better solutions, improving on the baselines by up to 98% for the benchmark problems and by up to 22% for cost optimization in workflow scheduling. Moreover, although trained on small problem instances, the framework performed well on unseen larger instances, demonstrating its ability to generalize. The proposed framework thus has the potential to improve both the generality and the performance of methods for solving large combinatorial optimization problems.
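To make the core idea concrete, the following is a minimal sketch of a policy-based RL hyper-heuristic in the spirit described above: a softmax policy, trained with REINFORCE, selects among a small set of constructive low-level heuristics while building a traveling-salesman tour. The heuristic set, the two-element state features, the reward baseline, and the toy instance are illustrative assumptions for this sketch, not the paper's actual design.

```python
# Minimal sketch of a policy-based RL hyper-heuristic (assumptions noted above):
# a linear softmax policy over constructive low-level heuristics, trained with
# REINFORCE on a toy TSP instance. Not the paper's exact framework.
import numpy as np

rng = np.random.default_rng(0)

def nearest_step(tour, unvisited, dist):
    # Low-level heuristic 1: append the unvisited city nearest to the last one.
    return min(unvisited, key=lambda c: dist[tour[-1], c])

def farthest_step(tour, unvisited, dist):
    # Low-level heuristic 2: append the farthest city (a diversifying move).
    return max(unvisited, key=lambda c: dist[tour[-1], c])

def random_step(tour, unvisited, dist):
    # Low-level heuristic 3: append a uniformly random unvisited city.
    return rng.choice(list(unvisited))

HEURISTICS = [nearest_step, farthest_step, random_step]

def features(tour, n):
    # Tiny illustrative state: fraction of tour built, plus a bias term.
    return np.array([len(tour) / n, 1.0])

def policy(theta, x):
    # Softmax over heuristic scores; shift by max for numerical stability.
    z = theta @ x
    z -= z.max()
    p = np.exp(z)
    return p / p.sum()

def construct(theta, dist):
    # One full construction episode; log states/actions for REINFORCE.
    n = dist.shape[0]
    tour, unvisited = [0], set(range(1, n))
    states, actions = [], []
    while unvisited:
        x = features(tour, n)
        a = rng.choice(len(HEURISTICS), p=policy(theta, x))
        states.append(x); actions.append(a)
        city = HEURISTICS[a](tour, unvisited, dist)
        tour.append(city); unvisited.remove(city)
    cost = sum(dist[tour[i], tour[(i + 1) % n]] for i in range(n))
    return tour, cost, states, actions

def train(dist, episodes=500, lr=0.1):
    theta = np.zeros((len(HEURISTICS), 2))  # one weight row per heuristic
    baseline = None
    for _ in range(episodes):
        _, cost, states, actions = construct(theta, dist)
        baseline = cost if baseline is None else 0.9 * baseline + 0.1 * cost
        advantage = baseline - cost  # beating the baseline => positive signal
        for x, a in zip(states, actions):
            p = policy(theta, x)
            grad = -np.outer(p, x)   # gradient of log pi(a|x) w.r.t. theta
            grad[a] += x
            theta += lr * advantage * grad
    return theta

# Toy instance: 30 random points in the unit square.
pts = rng.random((30, 2))
dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
theta = train(dist)
print("trained tour cost:", construct(theta, dist)[1])
```

The same selection loop would carry over to other constructive domains (e.g., bin packing or workflow scheduling) by swapping in domain-specific low-level heuristics, richer state features, and an appropriate cost signal.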