Abstract

Cloud computing is a promising technology that pools computing resources to offer better and more efficient services globally than the individual use of internet resources would allow. Because resources are heterogeneous and highly dynamic, failure during resource allocation is a key risk in the cloud. Such resource failures delay task execution and adversely affect quality of service (QoS). This paper proposes an effective and adaptive fault-tolerant scheduling approach to facilitate error-free task scheduling. The proposed method considers the most impactful parameters, such as the failure rate and current workload of the resources, for optimal QoS. The approach is validated using the CloudSim toolkit on commonly used metrics, including resource utilization, average execution time, makespan, throughput, and success rate. Empirical results indicate that the suggested approach is more efficient than the benchmark techniques in terms of load balancing and fault tolerance.

Highlights

  • Cloud computing provides cost-effective computing resources usually with more reliable performance by sharing a large amount of resources with many users who consume the resources at different times

  • This paper proposes an effective and adaptive fault-tolerant scheduling approach to facilitate error-free task scheduling

  • This paper proposes an Adaptive Fault Tolerant Resource Allocation (AFTRA) approach as a proactive fault tolerance measure

Summary

INTRODUCTION

Cloud computing provides cost-effective computing resources, usually with more reliable performance, by sharing a large amount of resources among many users who consume them at different times. A service execution fault occurs if a user or application uses a resource after its service time has expired. These faults usually result in one of the major failures that occur in cloud environments, including hardware failure, virtual machine failure, and application failure (Garraghan et al., 2014). Fault tolerance requires the development of a blueprint for continuing services even if a few resources in the cloud are down or inaccessible. It protects network devices and computing resources from failing because of faults during execution (Parvez, Robel, Rouf, Podder, & Bharati, 2019). Fault tolerance is needed to assure the availability and reliability of critical resources as well as of task execution. It includes the techniques necessary for robustness, failure recovery, and improved overall performance.

Proactive Fault Tolerant Resource Allocation Methods
Reactive Fault Tolerant Resource Allocation Methods
PROPOSED FAULT TOLERANT RESOURCE ALLOCATION
SIMULATION SETUP AND PERFORMANCE EVALUATIONS
Performance Metrics
Simulation Environment 1
Simulation Environment 2
CONCLUSION
