A Push/Pull-Based System Scheduling Framework for Large-Scale Cluster

Bo Wang, Zhiguang Chen, Nong Xiao

doi:10.1109/bigdia51454.2020.00058

Abstract

The integration of high-performance computing jobs and big data processing jobs has become a significant trend in industry. Because distributed big data clusters are slightly inferior to supercomputers in performance, more and more big data jobs are run on supercomputers such as Tianhe-2. These two types of jobs have different characteristics and resource requirements, which makes it difficult for the job schedulers currently used on supercomputers to match jobs to resources well. To this end, this paper proposes a resource scheduling framework that combines Push and Pull. Depending on the node resource usage and the amount of resources a job requests, the job is scheduled with one of two strategies, Push or Pull. Push means that the management node dispatches a task to a work node, and Pull means that a work node requests a task from the management node to execute. In our experiments, compared with the Push-based scheduling mode, the Push/Pull-based scheduling mode achieves about 3 times the job throughput without affecting scalability and increases resource utilization by 20.4%.
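The difference between the two modes can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not state the rule used to choose between Push and Pull, so the core-count cutoff `pull_threshold`, the "most free cores" placement policy, and all class and method names below are hypothetical and serve only to show who initiates each hand-off.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    cores: int  # amount of resources the job requests


@dataclass
class Node:
    name: str
    total_cores: int
    used_cores: int = 0
    running: list = field(default_factory=list)

    @property
    def free_cores(self) -> int:
        return self.total_cores - self.used_cores

    def start(self, job: Job) -> None:
        self.used_cores += job.cores
        self.running.append(job.name)


class Manager:
    """Hypothetical management node that supports both modes."""

    def __init__(self, nodes, pull_threshold=2):
        self.nodes = nodes
        # Assumed cutoff: jobs at or below it wait to be pulled.
        self.pull_threshold = pull_threshold
        self.pull_queue = deque()

    def submit(self, job: Job) -> str:
        """Return the mode chosen for the job: 'pull', 'push', or 'wait'."""
        if job.cores <= self.pull_threshold:
            # Pull: the job is queued until a work node asks for it.
            self.pull_queue.append(job)
            return "pull"
        # Push: the manager itself picks a node with enough free cores.
        candidates = [n for n in self.nodes if n.free_cores >= job.cores]
        if not candidates:
            return "wait"
        target = max(candidates, key=lambda n: n.free_cores)
        target.start(job)
        return "push"

    def request_task(self, node: Node):
        """Pull: a work node asks for a task; returns a Job or None."""
        for _ in range(len(self.pull_queue)):
            job = self.pull_queue.popleft()
            if job.cores <= node.free_cores:
                node.start(job)
                return job
            self.pull_queue.append(job)  # does not fit, keep it queued
        return None
```

In this sketch `submit` is the manager-initiated path (Push), while `request_task` is the worker-initiated path (Pull): a node with spare capacity calls it and receives the first queued job that fits.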

Full Text