Abstract

In large-scale computing clusters, when the server storing a task's requested data lacks sufficient computing capacity for the task, current job schedulers either schedule the task on the closest server and transmit the requested data to it, or make the task wait until the data server has sufficient computing capacity. The former solution generates network load, while the latter increases task delay. To handle this problem, load balancing methods are needed to reduce the number of servers overloaded by computing workloads. However, current load balancing methods do not aim to balance the computing load over the long term. Through trace analysis, we demonstrate the diversity of the computing workloads of different tasks and the necessity of balancing computing workloads among servers. We then propose a cost-efficient Computing load Aware and Long-View load balancing approach (CALV). CALV is novel in that it achieves long-term computing load balance by migrating out of an overloaded server the data blocks that, across the epochs of a time period, contribute more computing workload when the server is more overloaded and less computing workload when the server is more underloaded. Based on the task schedules, we further propose a task reassignment algorithm that reassigns tasks from an overloaded server to other data servers of the tasks to make it non-overloaded before CALV is conducted. The above methods are for tasks whose submission times and execution latencies can be predicted. To handle unexpected tasks or insufficiently accurate predictions, we propose a dynamic load balancing method in which an overloaded server, in order to become non-overloaded, dynamically redirects tasks to other data servers of the tasks, or replicates the tasks' requested data to other servers and redirects the tasks to those servers.
Finally, we propose a proximity-aware, tree-based distributed load balancing method to reduce the reallocation cost and improve the scalability of CALV. Trace-driven experiments in simulation and on a real computing cluster show that CALV outperforms other methods in balancing computing workloads and in cost efficiency.
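
The long-view selection idea can be illustrated with a small sketch. It is not the paper's algorithm: the scoring function (the dot product of a block's per-epoch load with the server's per-epoch excess load), the greedy stopping rule, and all names (`long_view_scores`, `pick_blocks_to_migrate`, `server_load`, `block_load`, `capacity`) are assumptions made here for illustration. The score is high for blocks that contribute load mostly at epochs when the server is overloaded and little at epochs when it is underloaded.

```python
from typing import Dict, List


def long_view_scores(
    server_load: List[float],
    capacity: float,
    block_load: Dict[str, List[float]],
) -> Dict[str, float]:
    """Score each block by how closely its load tracks the server's overload.

    Assumed scoring, not taken from the paper: sum over epochs of
    block load times server excess load.
    """
    # excess[e] > 0 means overloaded at epoch e; < 0 means underloaded.
    excess = [load - capacity for load in server_load]
    return {
        block: sum(b * x for b, x in zip(loads, excess))
        for block, loads in block_load.items()
    }


def pick_blocks_to_migrate(
    server_load: List[float],
    capacity: float,
    block_load: Dict[str, List[float]],
) -> List[str]:
    """Greedily pick the highest-scoring blocks until no epoch is overloaded."""
    remaining = list(server_load)
    scores = long_view_scores(server_load, capacity, block_load)
    chosen: List[str] = []
    for block in sorted(scores, key=scores.get, reverse=True):
        if all(load <= capacity for load in remaining):
            break  # the server is non-overloaded at every epoch
        chosen.append(block)
        # Removing the block removes its load at every epoch.
        remaining = [r - b for r, b in zip(remaining, block_load[block])]
    return chosen


# Hypothetical server with three blocks observed over three epochs.
blocks = {"A": [8, 1, 1], "B": [3, 3, 3], "C": [1, 6, 2]}
load = [12, 10, 6]  # per-epoch sum of the block loads
print(pick_blocks_to_migrate(load, capacity=10, block_load=blocks))
```

In this made-up example, block A carries most of the load in the only overloaded epoch and little in the underloaded one, so migrating it alone brings every epoch within capacity, whereas a selection based on a single epoch's total load could move a block that is needed to keep the server busy later.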
