Abstract

Scientific applications modeled as workflows can exhibit both task and data parallelism. Scheduling these workflows in a multi-cluster environment is challenging because of the large number of possible task mappings, and several heuristics have been proposed in recent years to address this problem. A key limitation of existing heuristics for multi-cluster environments is that each task is mapped onto a single resource, which restricts the resource options available for reducing workflow completion time. This paper introduces the Multi-Cluster Allocation-Heterogeneous Earliest Finish Time (MCA-HEFT) heuristic, which deploys single parallel tasks of a workflow across multiple clusters and schedules them accordingly. We evaluated MCA-HEFT against the Mixed-parallel Heterogeneous Earliest Finish Time (M-HEFT) heuristic, one of the best-known workflow scheduling heuristics in the literature. MCA-HEFT produced makespans up to 42% shorter than those produced by M-HEFT, with only approximately 10% of tasks distributed across multiple clusters. Our experiments considered several metrics and parameters, including critical path size, makespan, the number of clusters used to execute tasks, and the network impact of deploying tasks across multiple clusters.
