Abstract

In high‐performance computing (HPC) workloads (i.e. the set of computations to be completed), the same computational workflow of jobs (e.g. a Pipeline, a Fork&Join, or a Lattice graph) may be applied to different input files and parameters. Each of these workflow instances has the same workflow shape, but accesses (possibly) separate input, intermediate, and output files. Therefore, the selective isolation of each workflow instance can be important for maximizing scheduling flexibility and performance. However, in practice, realizing this benefit is not obvious due to a variety of problems and constraints. For example, the unmediated interaction of different workflow instances can lead to a problem of filename conflicts between concurrent workflow instances overwriting common files, which, for a control‐flow driven batch scheduler, may result in either unsafe computation of the multiple instances in the same sub‐directory or storage overheads when multiple directories are used. We propose a novel approach of selectively coupling and integrating job schedulers and file systems, known as a Workflow‐aware File System (WaFS), with two major benefits. First, separate namespaces can be constructed on a per‐instance basis to maximize the concurrency of workflow instances, despite filename conflicts, while minimizing storage overhead. Second, exploiting inferred dataflow information, trade‐offs can be made between makespan and storage overhead while maintaining correctness. Through a simulation‐based study, we have shown the potential benefits of WaFS to job concurrency and we have characterized the trade‐offs that can be made between storage overhead and performance. New scheduling policies, Versioned Namespace (VNS), Overwrite‐Safe Concurrency (OSC) and hybrids, are made possible by WaFS, with different advantages and disadvantages. Copyright © 2011 John Wiley & Sons, Ltd.
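
The idea of per‐instance namespaces can be illustrated with a minimal sketch. This is not the WaFS implementation: the class `VersionedNamespace`, its methods, and the `name.vN` naming scheme are hypothetical and chosen only for illustration. It assumes that files present before any instance runs are shared until an instance writes them, and that every write is redirected to an instance‐private physical file, so concurrent instances can reuse the same filenames without duplicating unchanged inputs.

```python
class VersionedNamespace:
    """Toy per-instance namespace: logical filename -> physical filename.

    Hypothetical illustration only; not the WaFS design or API.
    """

    def __init__(self, shared_files):
        # Files that exist before any instance runs; shared until overwritten.
        self.shared = {name: name for name in shared_files}
        # instance_id -> {logical name: instance-private physical name}
        self.private = {}

    def resolve_read(self, instance_id, name):
        # An instance sees its own version first, then the shared copy.
        own = self.private.get(instance_id, {})
        if name in own:
            return own[name]
        return self.shared[name]  # raises KeyError if the file does not exist

    def resolve_write(self, instance_id, name):
        # Writes go to an instance-private physical file, so two concurrent
        # instances writing the same logical name never collide.
        own = self.private.setdefault(instance_id, {})
        own.setdefault(name, f"{name}.v{instance_id}")
        return own[name]

    def physical_file_count(self):
        # Storage overhead proxy: shared files plus all private versions.
        return len(self.shared) + sum(len(m) for m in self.private.values())


# Two instances of the same workflow share one input and both write "tmp.dat".
ns = VersionedNamespace(shared_files=["reference.db"])
for inst in (1, 2):
    ns.resolve_write(inst, "tmp.dat")
```

In this sketch the shared input is stored once while each instance gets its own copy of the conflicting intermediate file, which is the storage‐versus‐concurrency balance the abstract describes; running each instance in a full private directory would instead duplicate `reference.db` as well.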
