Abstract
Many scientific applications feature large-scale workflows consisting of computing modules that must be strategically deployed and executed in distributed environments. The end-to-end performance of such scientific workflows depends on both the mapping scheme that determines module assignment, and the scheduling policy that determines resource allocation if multiple modules are mapped to the same node. These two aspects of workflow optimization are traditionally treated as two separated topics, and the interactions between them have not been fully explored by any existing efforts. As the scale of scientific workflows and the complexity of network environments rapidly increase, each individual aspect of performance optimization alone can only meet with limited success. We conduct an in-depth investigation into workflow execution dynamics in distributed environments and formulate a generic problem that considers both workflow mapping and task scheduling to minimize the end-to-end delay of workflows. We propose an integrated solution, referred to as Mapping and Scheduling Interaction (MSI), to improve the workflow performance. The efficacy of MSI is illustrated by both extensive simulations and proof-of-concept experiments using real-life scientific workflows for climate modeling on a PC cluster.
Accepted Version (
Free)
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have