Abstract

Cloud computing is a mature and flexible computing paradigm that provides services to scientific and business applications in a subscription-based environment. Scientific applications such as Montage and CyberShake are organized as scientific workflows with data- and compute-intensive tasks, and they have some special characteristics. In particular, the tasks of a scientific workflow are executed in integration, disintegration, pipeline, and parallelism patterns, and thus require special attention to task management and to data-oriented resource scheduling and management. Tasks executed in a pipeline are bottlenecks: if one of them fails, the whole execution becomes futile, so fault-tolerance-aware execution is required. Tasks executed in parallel require similar instances of cloud resources, and thus cluster-based execution may improve system performance in terms of make-span and execution cost. Therefore, this research work presents a cluster-based, fault-tolerant and data-intensive (CFD) scheduling strategy for scientific applications in cloud environments. The CFD strategy addresses the data intensiveness of the tasks of scientific workflows with cluster-based, fault-tolerant mechanisms. The Montage scientific workflow was used as the simulated application, and the results of the CFD strategy were compared with three well-known heuristic scheduling policies: (a) MCT, (b) Max-min, and (c) Min-min. The simulation results showed that the CFD strategy reduced the make-span by 14.28%, 20.37%, and 11.77%, respectively, compared with these three policies. Similarly, the CFD strategy reduced the execution cost by 1.27%, 5.3%, and 2.21%, respectively. With the CFD strategy, the SLA was not violated with regard to time and cost constraints, whereas the existing policies violated it numerous times.

Highlights

  • Cloud computing is a distributed and large-scale computing environment

  • The proposed CFD strategy is implemented in WorkflowSim to simulate a Montage [9,14,20,32] scientific workflow, which is one of the real-time scientific applications belonging to the field of astronomy

  • The proposed CFD strategy provides a detailed process from scientific data submission to the generation of results with component-based scenarios


Summary

Introduction

Cloud computing is a distributed and large-scale computing environment. It provides a pool of virtualized and dynamic computing services [1]. [24,25] All these challenges lead to the need for an effective and well-defined workflow scheduling strategy with cluster-based, fault-tolerant mechanisms. A cluster-based, fault-tolerant and data-intensive (CFD) resource scheduling and management strategy for scientific applications in a cloud environment is proposed in this research work. To show the efficiency of the CFD strategy, the Montage [14] scientific workflow is executed and the simulation results are compared with three well-known heuristic scheduling policies: (a) “Minimum Completion Time” (MCT) [32], (b) “Max-min” [22], and (c) “Min-min” [22]. The remainder of the paper is organized as follows: Section 2 presents the related work; Section 3 presents the System Model and Design; Section 4 provides in detail the experiments, results and discussions; and Section 5 concludes the work.
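For readers unfamiliar with the three baseline policies, the sketch below shows their textbook form for independent tasks. It is an illustration under stated assumptions, not the WorkflowSim implementation used in the paper: the function name `schedule`, the matrix `etc` (estimated run time of each task on each VM), and the sample numbers are all hypothetical, and workflow dependencies, data transfer, and cost are ignored.

```python
from typing import Dict, List, Tuple


def schedule(etc: List[List[float]], policy: str = "min-min") -> Tuple[Dict[int, int], float]:
    """Assign independent tasks to VMs with a simple list-scheduling heuristic.

    etc[t][v] is the (assumed known) run time of task t on VM v.
    policy is one of "mct", "min-min", "max-min".
    Returns (task -> VM assignment, make-span).
    """
    n_vms = len(etc[0])
    ready = [0.0] * n_vms            # time at which each VM becomes free
    assignment: Dict[int, int] = {}
    unscheduled = list(range(len(etc)))

    def best_vm(t: int) -> int:
        # VM that gives task t its earliest completion time right now
        return min(range(n_vms), key=lambda v: ready[v] + etc[t][v])

    while unscheduled:
        if policy == "mct":
            # MCT: take tasks in arrival order, no look-ahead over the queue
            t = unscheduled[0]
        else:
            # Earliest completion time of every waiting task
            ct = {u: ready[best_vm(u)] + etc[u][best_vm(u)] for u in unscheduled}
            # Min-min picks the task with the smallest such time,
            # Max-min the task with the largest
            pick = min if policy == "min-min" else max
            t = pick(unscheduled, key=lambda u: ct[u])
        v = best_vm(t)
        ready[v] += etc[t][v]
        assignment[t] = v
        unscheduled.remove(t)

    return assignment, max(ready)


# Hypothetical run times: 4 tasks (rows) on 2 VMs (columns)
etc = [[4, 6], [2, 3], [8, 5], [3, 7]]
for name in ("mct", "min-min", "max-min"):
    mapping, makespan = schedule(etc, name)
    print(name, mapping, makespan)
```

All three policies map a task to the VM with the earliest completion time; they differ only in which waiting task is mapped next, which is why they can produce different make-spans on the same input.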

Related Work
Limitations
System Design and Model
Application Interface
Workflow Admission
1: Workflow
Workflow
5: Workflow
Performance Evaluation Parameters
Discussion
Conclusions
