Abstract

In recent years, a growing number of large-scale data processing and computing workflow applications have been run on heterogeneous clouds. Such applications consist of precedence-constrained tasks and are usually deadline-constrained, so scheduling them is an essential problem for cloud providers. Minimizing the workflow execution cost under billing-period-based cloud pricing is a further challenge. To address these problems, we first model workflow applications as an I/O Data-aware Directed Acyclic Graph (DDAG), targeting clouds with global storage systems. Second, we state the deadline-constrained workflow scheduling problem mathematically, with the goal of minimizing the financial cost of execution, and prove that the problem is NP-hard by reduction from the multidimensional multiple-choice knapsack problem. Third, we propose a heuristic cost-efficient task scheduling strategy called CETSS, which comprises workflow DDAG model building, task subdeadline initialization, a greedy workflow scheduling algorithm, and a task adjusting method. The greedy workflow scheduling algorithm consists mainly of a dynamic method for sharing rented billing periods among tasks and a technique for relaxing the subdeadlines of unscheduled tasks. We perform extensive simulations on synthetic, randomly generated applications and on real-world applications such as Epigenomics, CyberShake, and LIGO. The experimental results demonstrate that CETSS outperforms the existing algorithms and effectively reduces the total workflow execution cost. In particular, CETSS is well suited to large workflow applications.
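The two ideas the abstract leans on, billing-period pricing and per-task subdeadlines, can be illustrated with a minimal sketch. This is not the CETSS algorithm: the function names `rental_cost` and `proportional_subdeadlines`, the toy workflow, and the proportional scaling rule are all assumptions chosen for illustration, and data transfer times are ignored.

```python
import math
from graphlib import TopologicalSorter  # Python 3.9+


def rental_cost(busy_seconds, period_seconds, price_per_period):
    """Cost of one VM rental when usage is rounded up to whole billing periods."""
    return math.ceil(busy_seconds / period_seconds) * price_per_period


def proportional_subdeadlines(preds, runtime, deadline):
    """Assumed heuristic: scale each task's earliest finish time so that the
    longest path through the DAG ends exactly at the workflow deadline."""
    finish = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(preds).static_order():
        start = max((finish[p] for p in preds.get(task, ())), default=0.0)
        finish[task] = start + runtime[task]
    critical_path = max(finish.values())
    return {t: f * deadline / critical_path for t, f in finish.items()}


# Hypothetical four-task workflow: a -> {b, c} -> d (runtimes in seconds).
preds = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
runtime = {"a": 10, "b": 20, "c": 30, "d": 10}
subdeadlines = proportional_subdeadlines(preds, runtime, deadline=100)

# Two tasks of 1200 s and 1800 s under an hourly billing period:
shared = rental_cost(1200 + 1800, 3600, 1.0)                          # one VM
separate = rental_cost(1200, 3600, 1.0) + rental_cost(1800, 3600, 1.0)  # two VMs
```

In this toy case, packing both tasks into one rented hour costs one billing period instead of two, which is the kind of saving a billing-period sharing method aims for, provided each task still meets its subdeadline.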
