Fine-Grained Allocation Algorithm for Sharing Heterogeneous Resources in Data Center

Xiaochun Tang, Xuefeng Fan, Ying Fu

doi:10.1051/jnwpu/20203830589

Open Access

Abstract

Data in a data center are stored in a distributed manner, and data-oriented computing dispatches big data analysis tasks to different computing nodes. The widespread use of graphics processing units (GPUs) makes it urgent to study how to assign heterogeneous resources reasonably to different computing frameworks. We survey existing big data computing frameworks and GPU computing. Building on the existing cluster resource management model and the GPU management model, we propose a hybrid heterogeneous resource management model that combines CPU resources with GPU resources. The computing nodes manage local resources and execute tasks; the resource management center coordinates the various computing frameworks. We design and implement a hybrid dominant resource sharing and allocation algorithm, which allocates resources to computing frameworks according to how they use their dominant resources, so that the frameworks share the hybrid resources fairly and the CPU is not overloaded with tasks while GPU or CPU tasks starve for resources. The experimental results show that the allocation algorithm increases the utilization of heterogeneous resources and the number of completed tasks by around 15%.

Highlights

The resources in the cluster comprise the number of CPU cores VC, the memory size VM, and the GPU memory size VG. The total resources of the cluster are denoted A(VC, VM, VG), the resources already in use are denoted R(VC, VM, VG), and the available resources are A - R.
Based on the existing cluster resource management model and the GPU management model, we propose a hybrid heterogeneous resource management model that combines CPU resources with GPU resources.
We design and implement a hybrid dominant resource sharing and allocation algorithm, which allocates resources to computing frameworks according to how they use their dominant resources, so that the frameworks share the hybrid resources fairly and the CPU is not overloaded with tasks while GPU or CPU tasks starve for resources.

Summary

Introduction

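The resources in the cluster comprise the number of CPU cores VC, the memory size VM, and the GPU memory size VG. The total resources of the cluster are denoted A(VC, VM, VG), the resources already in use are denoted R(VC, VM, VG), and the available resources are A - R. The test used two computing frameworks, JOB1 and JOB2. JOB1 is a string-counting program; each of its tasks uses <2 cores, 2 GB memory>, and its dominant resource is the CPU. JOB2 is a matrix multiplication implemented entirely on the GPU; each of its tasks uses [...], and its dominant resource is the GPU. [...] resources; because each dominant resource accounts for the same share, the numbers of completed tasks are roughly equal. In terms of total tasks, FIFO and hybrid DRF each complete roughly as many CPU tasks as GPU tasks, but hybrid DRF takes every framework's dominant resource into account, so each dominant resource is allocated as fully as possible. As a result, hybrid DRF completes markedly more tasks in total than FIFO. This shows that hybrid DRF lets each type of task maximize its own dominant resource, so the overall resources of the system are used more efficiently.

3.3 Comparison of coarse-grained and fine-grained GPU resource utilization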
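
To make the notation A, R and A - R and the idea of a dominant resource concrete, the following is a minimal sketch of textbook dominant resource fairness (progressive filling) over the three resource types named above. It is not the authors' hybrid DRF implementation. The cluster capacity, the per-task demand of JOB2 (which is missing from the text), and all function and field names are assumptions made for illustration only; the JOB1 demand of <2 cores, 2 GB memory> is taken from the test description.

```python
from dataclasses import dataclass, field

# Resource order used throughout: (CPU cores VC, memory GB VM, GPU memory GB VG)


@dataclass
class Framework:
    name: str
    demand: tuple                      # per-task demand, same order as above
    used: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    tasks: int = 0

    def dominant_share(self, total):
        # Largest fraction of any single cluster resource held by this framework.
        return max(u / t for u, t in zip(self.used, total) if t > 0)


def drf_allocate(total, frameworks):
    """Grant one task at a time to the framework with the smallest dominant share."""
    in_use = [0.0] * len(total)        # R: resources already allocated
    active = list(frameworks)
    while active:
        fw = min(active, key=lambda f: f.dominant_share(total))
        free = [a - r for a, r in zip(total, in_use)]   # A - R
        if all(d <= f for d, f in zip(fw.demand, free)):
            for i, d in enumerate(fw.demand):
                fw.used[i] += d
                in_use[i] += d
            fw.tasks += 1
        else:
            # The next task of this framework no longer fits; stop offering to it.
            active.remove(fw)
    return in_use


# Hypothetical cluster A: 16 cores, 32 GB memory, 16 GB GPU memory (assumed).
total = (16, 32, 16)
job1 = Framework("JOB1", demand=(2, 2, 0))   # <2 cores, 2 GB memory>, CPU-dominant
job2 = Framework("JOB2", demand=(1, 1, 2))   # assumed demand, GPU-dominant
in_use = drf_allocate(total, [job1, job2])

for fw in (job1, job2):
    print(fw.name, fw.tasks, round(fw.dominant_share(total), 3))
```

With these assumed numbers each task adds the same amount to its framework's dominant share, so the two frameworks are served alternately until the CPU cores run out. A FIFO baseline would instead grant resources strictly in arrival order, regardless of which resource each framework depends on most.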

Results

Conclusion