Abstract

Heterogeneous distributed systems are widely deployed for executing computationally intensive parallel applications with diverse computing needs. Such environments require effective scheduling strategies that take into account both algorithmic and architectural characteristics. Unfortunately, most of the scheduling algorithms developed for such systems rely on a simple platform model where communication contention is not taken into account. In addition, it is generally assumed that processors are completely safe. To schedule precedence graphs in a more realistic framework, we introduce first an efficient fault-tolerant scheduling algorithm that is both contention-aware and capable of supporting an arbitrary number of fail-silent (fail-stop) processor failures. Next, we derive a more complex heuristic that departs from the main principle of the first algorithm. Instead of considering a single task (one with highest priority) and assigning all its replicas to the currently best available resources, we consider a chunk of ready tasks, and assign all their replicas in the same decision making procedure. This leads to a better load balance of processors and communication links. We focus on a bi-criteria approach, where we aim at minimizing the total execution time, or latency, given a fixed number of failures supported in the system. Our algorithms have a low time complexity, and drastically reduce the number of additional communications induced by the replication mechanism. Experimental results fully demonstrate the usefulness of the proposed algorithms, which lead to efficient execution schemes while guaranteeing a prescribed level of fault-tolerance.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.