Abstract

Supercomputer centers often deploy large-scale computing systems together with an associated data analysis or visualization system. In this paper, we propose a co-scheduling mechanism that coordinates the execution of jobs on different systems. The mechanism is built on top of a lightweight protocol for coordination between policy domains without manual intervention. We have evaluated this system using real job traces from Intrepid and Eureka, the production Blue Gene/P and data analysis systems, respectively, deployed at Argonne National Laboratory. Our experimental results quantify the costs of co-scheduling and demonstrate that co-scheduling can be achieved with limited impact on system performance under varying workloads.
