Abstract

A job management system (JMS) is an important part of any supercomputer. The JMS creates a schedule for launching the jobs of different users. Modern job management systems are complex software systems with many settings. These settings have a significant impact on various JMS metrics, such as supercomputer resource utilization and the mean waiting time of a job in the queue. JMS simulators are widely used to study how JMS settings or modifications, new scheduling algorithms, parameters of the input job stream, or the available computing resources affect JMS efficiency metrics. The article presents a comparative analysis of current JMS simulators (Alea, ScSF, Batsim, AccaSim, Slurm simulator) and their application areas. The authors consider new ways to use a JMS simulator as a scientific service for researchers. With such a service, researchers are able to test various hypotheses about JMS efficiency, algorithms, or parameters. This offers several benefits: (1) research is performed on the service side around the clock, (2) the accuracy and adequacy of the simulator are ensured by the service, and (3) the reproducibility of research results is ensured. In addition, the simulator-as-a-service becomes a single entry point for researchers.

Highlights

  • A job management system (JMS) is essential software for multi-user high-performance computing [1]

  • JMS handles a queue of user jobs, determines job launch order, allocates computing nodes for launched jobs, controls job termination and checks that nodes are freed after job termination

  • We suggest a new approach to the experimental study of JMSs, based on a JMS model that is developed once and served publicly for use by researchers

Summary

INTRODUCTION

A job management system (JMS) is essential software for multi-user high-performance computing [1]. Most JMSs provide the user with a worst-case job launch time (assuming no nodes fail), computed as if every job in the queue ran until its time limit. A forecast could give the user a more precise estimate of the launch time. Such a forecast is especially important for geographically distributed supercomputer systems. Note that the integration of geographically distributed supercomputer resources is a steady trend in high-performance computing [3]. The launch location is the supercomputer in the distributed digital platform on which the job will be executed. The forecast could be used to schedule a global job queue for the distributed supercomputer system [5]. A job from the global queue could be executed on a less busy JMS, which reduces resource imbalance. This can be achieved by modelling the management system in order to predict the launch time and location of each job. The most popular implementation of a JMS model is a simulator, so we use the words «model» and «simulator» as synonyms.
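
To illustrate the worst-case launch time mentioned above, here is a minimal sketch in Python. It is not taken from the article or from any particular JMS. It assumes a strict first-come-first-served policy without backfilling, identical nodes, no node failures, and that every job runs exactly until its time limit. The names `Job` and `worst_case_starts` are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int       # number of nodes requested
    time_limit: int  # requested time limit, in minutes


def worst_case_starts(total_nodes, running, queue):
    """Estimate worst-case start times (minutes from now) for queued jobs.

    running: list of (remaining_limit, nodes) pairs for jobs already running.
    queue:   list of Job objects in FCFS order.
    Assumption: strict FCFS, no backfilling, every job ends at its limit.
    """
    free = total_nodes - sum(nodes for _, nodes in running)
    if free < 0:
        raise ValueError("running jobs use more nodes than exist")

    # Min-heap of (release_time, nodes) for all currently allocated nodes.
    releases = list(running)
    heapq.heapify(releases)

    now = 0
    starts = {}
    for job in queue:
        if job.nodes > total_nodes:
            raise ValueError(f"job {job.name} can never start")
        # Advance time until enough nodes have been released.
        while free < job.nodes:
            release_time, nodes = heapq.heappop(releases)
            now = max(now, release_time)
            free += nodes
        starts[job.name] = now
        free -= job.nodes
        heapq.heappush(releases, (now + job.time_limit, job.nodes))
    return starts


# Example: a 4-node machine with two running 2-node jobs.
starts = worst_case_starts(
    total_nodes=4,
    running=[(30, 2), (60, 2)],
    queue=[Job("a", 2, 100), Job("b", 4, 10), Job("c", 1, 5)],
)
# starts == {"a": 30, "b": 130, "c": 140}
```

In practice jobs often finish before their limit, so actual launch times are usually earlier than this bound; that gap is what a simulator-based forecast would aim to narrow.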

RELATED WORK
JMS MODEL AS A SCIENTIFIC SERVICE
CONCLUSION