Temporal Fusion Pointer network-based Reinforcement Learning algorithm for Multi-Objective Workflow Scheduling in the cloud

Binyang Wang, Yuanqing Xia, Huifang Li, Zhiwei Lin

doi:10.1109/ijcnn48605.2020.9207151

Abstract

Cloud computing is emerging as a promising deployment environment for hosting exponentially increasing scientific and social media applications, and managing and executing these applications efficiently depends mainly on workflow scheduling. However, scheduling workflows in the cloud is an NP-hard problem, and existing solutions have limitations when applied to real-world scenarios. In this paper, we propose a Temporal Fusion Pointer network-based Reinforcement Learning algorithm for multi-objective workflow scheduling (TFP-RL). By adopting reinforcement learning, the algorithm discovers its own heuristics over time, learning continuously from the rewards that result from good scheduling solutions. To make scheduling decisions more comprehensive by accounting for the influence of historical actions, a novel temporal fusion pointer network (TFP) is designed for the reinforcement learning agent, which improves the quality of the resulting solutions and the algorithm's ability to handle diverse workflow applications. To decrease convergence time, we train the proposed TFP-RL model independently with the Asynchronous Advantage Actor-Critic method and use the resulting model to schedule workflows. Finally, under a multi-agent reinforcement learning framework, a Pareto dominance-oriented criterion for reasonable action selection is established for the multi-objective optimization scenario. We first train the TFP-RL model on randomly generated workflows to validate its effectiveness in scheduling, then compare the trained model with existing scheduling approaches on practical compute- and data-intensive workflows. Experimental results demonstrate that the proposed algorithm outperforms the benchmark algorithms across different metrics.
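
As a rough illustration of what Pareto dominance means in a multi-objective setting, the following minimal sketch filters a set of candidate actions down to the non-dominated ones. It is not the selection criterion defined in the paper: the function names `dominates` and `non_dominated`, the choice of two objectives (makespan and cost), the assumption that both are minimized, and the numeric values are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b, assuming every objective is minimized.

    a dominates b when a is no worse than b in every objective
    and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(candidates: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, c in enumerate(candidates)
        if not any(dominates(o, c) for j, o in enumerate(candidates) if j != i)
    ]


# Hypothetical (makespan, cost) estimates for three candidate actions.
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 6.0)]
front = non_dominated(candidates)
```

Here the third candidate is worse than the first in both objectives, so only the first two remain as mutually non-dominated trade-offs. A scheduler would still need some further rule to pick a single action among them.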

Full Text