Abstract
Workflow-based workloads usually consist of multiple instances of the same workflow, each of which is a set of jobs with control or data dependencies that carries out a well-defined scientific computation task on its own input data. To maximize throughput, many instances can be run simultaneously to achieve a high degree of concurrency. However, deadlock becomes a potential problem when storage is constrained. To address this problem, we design and evaluate a deadlock detection-based scheduling (DDS) algorithm that achieves high performance by making the best use of the available storage resources. Our algorithm takes advantage of the dataflow information of the workflow to speculatively schedule each instance whenever the currently available storage is sufficient for some of its constituent jobs, though not necessarily for the whole workflow instance. Whenever deadlock or a performance anomaly is detected, selected in-progress workflow instances are rolled back to release storage for other blocked jobs. We develop a suite of strategies to select the victims and beneficiaries (instances or jobs) and evaluate their performance in a simulation-based study. Our results show that the DDS algorithm can adapt the job concurrency to the available storage resources and achieve higher performance than some deadlock avoidance methods in our synthetic and real workflow computations.
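The idea of speculative scheduling with rollback can be illustrated with a toy simulation. The sketch below is not the DDS algorithm itself, only a minimal illustration under simplifying assumptions that are not taken from the paper: each instance is a linear chain of at least one job, every job's output stays in storage until its instance finishes, time advances in rounds, and a rolled-back victim is suspended until some instance completes. The names `Instance`, `simulate`, and `fewest_jobs_done` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Instance:
    name: str
    job_sizes: list    # storage claimed by each job's output, in execution order
    next_job: int = 0  # index of the next job to run
    held: int = 0      # storage currently held by this instance

    @property
    def done(self):
        return self.next_job == len(self.job_sizes)


def fewest_jobs_done(inst):
    # Hypothetical victim policy: undo the instance with the least work to redo.
    return inst.next_job


def simulate(instances, capacity, victim_key=fewest_jobs_done):
    """Run all instances to completion; return (finish order, victims, rounds)."""
    if any(sum(i.job_sizes) > capacity for i in instances):
        raise ValueError("an instance cannot fit in storage even when run alone")
    active, suspended = list(instances), []
    free, finished, victims, rounds = capacity, [], [], 0
    while active:
        rounds += 1
        progressed = False
        for inst in active:
            need = inst.job_sizes[inst.next_job]
            if need <= free:  # speculative: only this one job has to fit
                free -= need
                inst.held += need
                inst.next_job += 1
                progressed = True
        completed = [i for i in active if i.done]
        for inst in completed:
            free += inst.held  # a finished instance releases all of its data
            inst.held = 0
            finished.append(inst.name)
        if completed:
            # Storage was released, so rolled-back instances may try again.
            active = [i for i in active if not i.done] + suspended
            suspended = []
        elif not progressed:
            # Deadlock: every active instance is blocked waiting for storage.
            victim = min((i for i in active if i.held > 0), key=victim_key)
            free += victim.held  # rollback discards the victim's partial data
            victim.held, victim.next_job = 0, 0
            victims.append(victim.name)
            active.remove(victim)
            suspended.append(victim)
    return finished, victims, rounds
```

For example, `simulate([Instance("A", [2, 2]), Instance("B", [2, 2])], capacity=4)` admits the first job of both instances, detects that neither can continue, rolls back `A`, and returns `(["B", "A"], ["A"], 5)`. Running the two instances serially would take only four rounds in this toy model, which shows why the choice of victims and the detection of performance anomalies matter.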