Abstract
In systems consisting of multiple clusters of processors interconnected by relatively slow connections, such as our Distributed ASCI Supercomputer (DAS), jobs may request co-allocation, i.e., the simultaneous allocation of processors in different clusters. The performance of co-allocation may be severely impacted by the slow intercluster connections and by the types of job requests. We distinguish different job request types, ranging from ordered requests that specify the number of processors needed in each of the clusters to flexible requests that only specify a total. We simulate multicluster systems with the FCFS policy, and with two policies for placing a flexible request (one tries to balance cluster loads and one tries to fill clusters completely), to determine the response times under workloads consisting of a single request type or of different request types, for different communication speeds across the intercluster connections. In addition to a synthetic workload, we also consider a workload derived from measurements of a real application on the DAS. We find that the communication speed difference has a severe impact on response times, that a relatively small amount of capacity is lost due to communication, and that for a mix of request types, the performance is determined not only by the separate behaviours of the different types of requests, but also by the way in which they interact.
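The two placement policies for flexible requests can be illustrated with a minimal sketch. The abstract does not give the exact rules, so the code below is one plausible reading and not the simulator used in the study: the function names, the representation of clusters as a list of idle-processor counts, the largest-idle-first order for filling, and the tie-breaking are all assumptions made for illustration.

```python
from typing import List, Optional


def place_cluster_filling(total: int, idle: List[int]) -> Optional[List[int]]:
    """Hypothetical cluster-filling rule: take as many processors as possible
    from one cluster before moving on, so the job spans few clusters.
    Visiting clusters in order of most idle processors is an assumption."""
    if total > sum(idle):
        return None  # the request does not fit; under FCFS it would wait
    alloc = [0] * len(idle)
    remaining = total
    for i in sorted(range(len(idle)), key=lambda k: -idle[k]):
        take = min(idle[i], remaining)
        alloc[i] = take
        remaining -= take
        if remaining == 0:
            break
    return alloc


def place_load_balancing(total: int, idle: List[int]) -> Optional[List[int]]:
    """Hypothetical load-balancing rule: hand out processors one at a time,
    always from the cluster that currently has the most idle processors,
    so the remaining idle counts end up as even as possible."""
    if total > sum(idle):
        return None  # the request does not fit; under FCFS it would wait
    left = list(idle)
    alloc = [0] * len(idle)
    for _ in range(total):
        i = max(range(len(left)), key=lambda k: left[k])  # ties: lowest index
        left[i] -= 1
        alloc[i] += 1
    return alloc


if __name__ == "__main__":
    idle_processors = [8, 4, 2]  # idle processors per cluster (made-up values)
    print(place_cluster_filling(6, idle_processors))
    print(place_load_balancing(6, idle_processors))
```

Under these assumptions, the filling rule keeps a job inside as few clusters as possible, which limits its exposure to the slow intercluster connections, while the balancing rule spreads it out and leaves the clusters with more even amounts of free capacity.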