Abstract

A computational grid is a network of loosely coupled, heterogeneous and geographically dispersed computers acting together to perform a large compute-intensive job. In this article, we focus on the existing approaches to the grid scheduling, load balancing and fault tolerance problems. Although grid scheduling, load balancing and fault tolerance are active research areas in grid computing, they have largely been, and continue to be, developed independently of one another, each focusing on different aspects of computing. Hence, in this survey, we hope to show that robust applications that provide efficient results can be designed by considering these areas collectively. To this end, we first introduce the motivation for grid computing and its grid scheduling, load balancing and fault tolerance concepts, and discuss the works that have made significant contributions to each of these areas from its inception until 2013. We discuss their advantages and disadvantages and analyze their suitability for use in a dynamic grid environment. We conclude that, while important advancements have been made in each of these areas individually, high-performance approaches that consider these areas together remain to be explored. We also discuss the research work that is missing and what we believe the community should be considering. To the best of our knowledge, no such survey has been conducted in the literature to date.

Highlights

  • Due to the evolution of science and engineering, problems in these fields have become increasingly complex

  • A distributed computing environment (DCE) is predictable: resource availability rests on the assumption that reservations and processing speeds are static and known in advance

  • Optimum resource utilization: A load balancing algorithm should maximize the utilization of the resources by minimizing the time or cost associated with these resources
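
As a rough illustration of the last two highlights, the sketch below assigns tasks to resources whose processing speeds are static and known in advance, placing each task on the resource that would finish it earliest. It is not taken from the surveyed works: the function name `assign_tasks`, the task sizes and the speeds are hypothetical, and the greedy largest-task-first rule is only one simple heuristic among many.

```python
def assign_tasks(task_sizes, speeds):
    """Greedy earliest-finish assignment (illustrative sketch only).

    Assumes every resource speed is static and known in advance.
    task_sizes: work units per task (hypothetical values)
    speeds: work units per time unit for each resource (hypothetical values)
    """
    # finish[i] is the time at which resource i becomes free again
    finish = [0.0] * len(speeds)
    assignment = []
    # Place the largest tasks first so small tasks can fill remaining gaps
    for task_id, size in sorted(enumerate(task_sizes), key=lambda p: -p[1]):
        # Pick the resource that would complete this task earliest
        best = min(range(len(speeds)), key=lambda i: finish[i] + size / speeds[i])
        finish[best] += size / speeds[best]
        assignment.append((task_id, best))
    # The makespan is the time at which the last resource finishes
    return assignment, max(finish)


if __name__ == "__main__":
    plan, makespan = assign_tasks([4, 2, 2], [2.0, 1.0])
    print(plan, makespan)
```

Here the time objective is the makespan; a cost objective could be sketched the same way by replacing `size / speeds[i]` with a per-resource cost term.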


Summary

Introduction

Due to the evolution of science and engineering, problems in these fields have become increasingly complex. A computing grid is an amalgamation of hardware and software infrastructures from different locations that offer dependable, steady and cost-effective access to high-end computational capabilities [1]. Grids facilitate the dynamic sharing, aggregation and selection of geographically distributed, independent computers at run-time based on their accessibility, performance and capability. Developments in scheduling research reflect a shift from isolated multi-host scenarios to large-scale infrastructures.

Load balancing
Load balancing characteristics
Fault tolerance
Grid scheduling techniques
Load balancing approaches
Fault tolerance methods
Discussions and Future Research Directions