Abstract

Data in a data center are stored in a distributed manner, and data-oriented computing dispatches big data analysis tasks to different computing nodes. The widespread use of graphics processing units (GPUs) makes it urgent to study how to assign heterogeneous resources sensibly to different computing frameworks. We survey existing big data computing frameworks and GPU computing. Building on the existing cluster resource management model and the GPU management model, we propose a hybrid heterogeneous resource management model that combines CPU resources with GPU resources. The computing nodes manage local resources and execute tasks; the resource management center coordinates the various computing frameworks. We design and implement a hybrid dominant resource sharing and allocation algorithm, which allocates resources to computing frameworks according to how each framework uses them. The goal is to share the hybrid resources fairly among the computing frameworks and to avoid overloading the CPU with tasks while frameworks that depend on GPU or CPU resources are starved. The experimental results show that the allocation algorithm increases the utilization of heterogeneous resources and the number of completed tasks by about 15%.

Highlights

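  • The resources in the cluster comprise the number of CPU cores VC, the memory size VM, and the GPU memory size VG. The total cluster resources are denoted A(VC, VM, VG), the resources already in use are denoted R(VC, VM, VG), and the available resources are A - R.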

  • Based on the existing cluster resource management model and the GPU management model, we propose a hybrid heterogeneous resource management model that combines CPU resources with GPU resources

  • We design and implement a hybrid dominant resource sharing and allocation algorithm, which allocates resources to computing frameworks according to how each framework uses them, so that the hybrid resources are shared fairly among the computing frameworks and the CPU is not overloaded with tasks while frameworks that depend on GPU or CPU resources are starved
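The resource vectors A(VC, VM, VG) and R(VC, VM, VG) and the idea of sharing by dominant resource can be illustrated with a small sketch. This is not the authors' implementation: it is a plain dominant-resource-fairness loop extended with a GPU memory dimension, and the cluster capacity, the GPU job's per-task demand, and all names (`Framework`, `hybrid_drf`, `fits`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field

# Resource dimensions: VC (CPU cores), VM (memory), VG (GPU memory).
RESOURCES = ("cpu_cores", "mem_gb", "gpu_mem_gb")


@dataclass
class Framework:
    name: str
    demand: dict  # per-task demand, one entry per resource in RESOURCES
    allocated: dict = field(default_factory=lambda: {r: 0.0 for r in RESOURCES})
    tasks: int = 0

    def dominant_share(self, total):
        # Largest fraction of any single cluster resource held by this framework.
        return max(self.allocated[r] / total[r] for r in RESOURCES)


def fits(demand, total, used):
    # A task fits if it stays within the available resources A - R.
    return all(used[r] + demand[r] <= total[r] for r in RESOURCES)


def hybrid_drf(total, frameworks):
    """Launch tasks one at a time, always serving the framework with the
    smallest dominant share; a framework whose next task no longer fits
    is dropped from consideration. Returns the used-resource vector R."""
    used = {r: 0.0 for r in RESOURCES}
    active = list(frameworks)
    while active:
        fw = min(active, key=lambda f: f.dominant_share(total))
        if not fits(fw.demand, total, used):
            active.remove(fw)
            continue
        for r in RESOURCES:
            used[r] += fw.demand[r]
            fw.allocated[r] += fw.demand[r]
        fw.tasks += 1
    return used


# Hypothetical cluster capacity A (assumption, not from the paper).
total = {"cpu_cores": 16.0, "mem_gb": 32.0, "gpu_mem_gb": 8.0}
# CPU-dominant job: 2 cores and 2 GB of memory per task.
job1 = Framework("JOB1", {"cpu_cores": 2.0, "mem_gb": 2.0, "gpu_mem_gb": 0.0})
# GPU-dominant job: the per-task demand here is a made-up placeholder.
job2 = Framework("JOB2", {"cpu_cores": 1.0, "mem_gb": 1.0, "gpu_mem_gb": 1.0})

used = hybrid_drf(total, [job1, job2])
available = {r: total[r] - used[r] for r in RESOURCES}  # A - R
print(job1.tasks, job2.tasks, available)
```

Because each framework is measured by its own dominant resource, neither the CPU-bound nor the GPU-bound job can monopolize the scheduler, which is the starvation-avoidance property the highlight describes.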


Summary

Introduction

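The resources in the cluster comprise the number of CPU cores VC, the memory size VM, and the GPU memory size VG. The total cluster resources are denoted A(VC, VM, VG), the resources already in use are denoted R(VC, VM, VG), and the available resources are A - R.

The tests used two computing frameworks, JOB1 and JOB2. JOB1 is a string-counting program; each of its tasks uses <2 cores, 2 GB of memory>, and its dominant resource is the CPU. JOB2 is a matrix multiplication implemented entirely on the GPU; each of its tasks uses <…>, and its dominant resource is the GPU.

[…] resources, and each dominant resource accounts for the same share, so the numbers of completed tasks are roughly equal. Looking at the total task count, FIFO and hybrid DRF each complete roughly equal numbers of CPU and GPU tasks. However, hybrid DRF takes each framework's dominant resource into account, so that every dominant resource receives its maximum allocation, and the total number of tasks completed under hybrid DRF is therefore markedly higher than under FIFO. This shows that hybrid DRF lets each type of task maximize its own dominant resource, so that the system's overall resources are used more efficiently.

3.3 Comparison of coarse-grained and fine-grained GPU resource utilization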
