Abstract
Cloud computing is gaining tremendous momentum in both academia and industry. In this context, we define the term “Cloud Workflow” as the specification, execution and provenance tracking of large-scale scientific workflows, as well as the management of data and computing resources to support their execution in the Cloud. In this paper, we first analyze the gap between Cloud computing and scientific workflow management, two complementary technologies, and what it means to bring Clouds and workflows together. We then present the key challenges in supporting Cloud workflows, along with our reference framework for scientific workflow management in the Cloud. Finally, we present our experience in integrating a scientific workflow management system, Swift, into the Cloud. We discuss the performance of cluster provisioning within the OpenNebula Cloud platform, the Eucalyptus Cloud platform and Amazon EC2, and we demonstrate the capability and efficiency of the integration using a NASA MODIS image processing workflow and the Montage image mosaic workflow.
Note to practitioners. Scientific workflow management plays a very important role in scientific computing and application coordination, while Cloud computing offers scalability and on-demand resources. We devise autonomous methods to integrate scientific workflow management systems with Cloud platforms and to provision resources for large-scale workflows, which helps scientists manage their workflows in the Cloud easily and take advantage of large-scale Cloud resources. There are a few integration options and many challenges in the process, and the experience we gained will help researchers migrate their workflow management systems and workflow applications into the Cloud.