Towards Long-View Computing Load Balancing in Cluster Storage Systems

Guoxin Liu, Haiying Shen, Haoyu Wang

doi:10.1109/tpds.2016.2632713

Open Access

https://doi.org/10.1109/tpds.2016.2632713

Abstract

In large-scale computing clusters, when the server storing a task's requested data does not have sufficient computing capacity for the task, current job schedulers either schedule the task to the closest server and transmit the requested data to it, or let the task wait until the server has sufficient computing capacity. The former solution generates network load, while the latter increases task delay. To handle this problem, load balancing methods are needed to reduce the number of servers overloaded by computing workloads. However, current load balancing methods do not aim to balance the computing load over the long term. Through trace analysis, we demonstrate the diversity of computing workloads across tasks and the necessity of balancing computing workloads among servers. We then propose a cost-efficient Computing load Aware and Long-View load balancing approach (CALV). CALV is novel in that it achieves long-term computing load balance across the epochs of a time period: from an overloaded server, it migrates out the data blocks that contribute more computing workload in epochs when the server is more overloaded and less computing workload in epochs when the server is more underloaded. Based on the task schedules, we further propose a task reassignment algorithm that reassigns tasks from an overloaded server to other data servers of the tasks, so that the server is no longer overloaded before CALV is conducted. The above methods apply to tasks whose submission times and execution latencies can be predicted. To handle unexpected tasks or insufficiently accurate predictions, we propose a dynamic load balancing method in which an overloaded server relieves its overload either by dynamically redirecting tasks to other data servers of the tasks, or by replicating the tasks' requested data to other servers and redirecting the tasks to those servers. Finally, we propose a proximity-aware, tree-based distributed load balancing method to reduce the reallocation cost and improve the scalability of CALV. Trace-driven experiments in simulation and on a real computing cluster show that CALV outperforms other methods in balancing computing workloads and in cost efficiency.
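
The long-view block selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' CALV algorithm: the scoring rule (a dot product between a block's per-epoch load and the server's per-epoch excess over capacity), the greedy loop, and all names (`overload_profile`, `block_score`, `select_blocks_to_migrate`) are assumptions made purely for illustration.

```python
from typing import Dict, List


def overload_profile(block_loads: Dict[str, List[float]], capacity: float) -> List[float]:
    """Per-epoch excess of the server's total load over its capacity.

    Positive values mean the server is overloaded in that epoch,
    negative values mean it is underloaded.
    """
    epochs = len(next(iter(block_loads.values())))
    totals = [sum(loads[e] for loads in block_loads.values()) for e in range(epochs)]
    return [t - capacity for t in totals]


def block_score(loads: List[float], excess: List[float]) -> float:
    """Hypothetical score: high when a block's load falls in overloaded epochs
    and low when it falls in underloaded epochs."""
    return sum(l * x for l, x in zip(loads, excess))


def select_blocks_to_migrate(block_loads: Dict[str, List[float]], capacity: float) -> List[str]:
    """Greedily pick blocks to migrate until no epoch exceeds capacity."""
    remaining = dict(block_loads)
    migrated: List[str] = []
    while remaining:
        excess = overload_profile(remaining, capacity)
        if max(excess) <= 0:
            break  # the server is non-overloaded in every epoch
        best = max(remaining, key=lambda b: block_score(remaining[b], excess))
        migrated.append(best)
        del remaining[best]
    return migrated


# Block "a" has the larger total load (6 vs. 4), but block "b" is the one
# whose load falls entirely in the overloaded first epoch.
chosen = select_blocks_to_migrate({"a": [3, 3], "b": [4, 0]}, capacity=5)
```

In this toy input, a selection based only on total load would move block "a", whereas the per-epoch view moves block "b", whose removal clears the overload while leaving load in the underloaded epoch.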

Journal: IEEE Transactions on Parallel and Distributed Systems
Publication Date: Jun 1, 2017
Citations: 7
License type: publisher-specific, author manuscript

Similar Papers

Computing load aware and long-view load balancing for cluster storage systems
Guoxin Liu ... Haoyu Wang
01 Oct 2015

Improving the Efficiency of Scaling Cloud Application Architecture
Oleh Streltsov ... Mykhailo Katrichenko
ELECTRICAL AND COMPUTER SYSTEMS | VOL. 38
01 Jan 2023

Joint Optimisation of Load Balancing and Handover for Hybrid LiFi and WiFi Networks
Xiping Wu ... Harald Haas
01 Mar 2017
