Abstract

We present WaFS, a user-level file system, and a related scheduling algorithm for scientific workflow computation in the cloud. WaFS’s primary design goal is to automatically detect and gather the explicit and implicit data dependencies between workflow jobs, rather than to provide high-performance file access. Using WaFS’s data, a workflow scheduler can either make effective cost-performance tradeoffs or improve storage utilization. Proper resource provisioning and storage utilization on pay-as-you-go clouds can be more cost-effective than the use of resources in traditional HPC systems. WaFS and the scheduler control the number of concurrent workflow instances at runtime so that the storage is well used, while the total makespan (i.e., turnaround time for a workload) is not severely compromised. We describe the design and implementation of WaFS and the new workflow scheduling algorithm based on our previous work. We present empirical evidence that the overheads of our WaFS prototype are acceptable and describe a simulation-based study, using representative workflows, that shows the makespan benefits of our WaFS-enabled scheduling algorithm.
