Abstract

In job scheduling, the concept of malleability has been explored since many years ago. Research shows that malleability improves system performance, but its utilization in HPC never became widespread. The causes are the difficulty in developing malleable applications, and the lack of support and integration of the different layers of the HPC software stack. However, in the last years, malleability in job scheduling is becoming more critical because of the increasing complexity of hardware and workloads. In this context, using nodes in an exclusive mode is not always the most efficient solution as in traditional HPC jobs, where applications were highly tuned for static allocations, but offering zero flexibility to dynamic executions. This paper proposes a new holistic, dynamic job scheduling policy, Slowdown Driven (SD-Policy), which exploits the malleability of applications as the key technology to reduce the average slowdown and response time of jobs. SD-Policy is based on backfill and node sharing. It applies malleability to running jobs to make room for jobs that will run with a reduced set of resources, only when the estimated slowdown improves over the static approach. We implemented SD-Policy in SLURM and evaluated it in a real production environment, and with a simulator using workloads of up to 198K jobs. Results show better resource utilization with the reduction of makespan, response time, slowdown, and energy consumption, up to respectively 7%, 50%, 70%, and 6%, for the evaluated workloads.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.