Abstract

This paper surveys state-of-the-art workflow scheduling algorithms, assuming cloud computing as the underlying compute infrastructure supporting large-scale scientific workflows that involve big data. The survey also reviews a few selected representative scientific workflow systems in terms of usability, performance, popularity, and other prominent features. In contrast to existing related surveys, most of which try to be comprehensive in coverage and inevitably fall short in the depth of their treatment of workflow scheduling, this survey emphasizes the two dominant factors in workflow scheduling, the makespan and the monetary cost of workflow execution, which results in a useful taxonomy of workflow scheduling algorithms as an additional contribution. The survey aims to balance breadth and depth in its coverage: after a broad review, it spotlights the top ten representative scheduling algorithms and the top five workflow management systems that leverage cloud infrastructure, with an emphasis on support for big data scientific workflows.
