Predicting queue wait time probabilities for multi-scale computing.

Vytautas Jancauskas, Tomasz Piontek, Bartosz Bosak, Piotr Kopta

doi:10.1098/rsta.2018.0151

Abstract

We describe a method for queue wait time prediction in supercomputing clusters. It was designed for use as part of multi-criteria brokering mechanisms for resource selection in a multi-site High Performance Computing environment. The aim is to incorporate the time jobs stay queued in the scheduling system into the selection criteria. Our method can also be used by end users to estimate the time to completion of their computing jobs. It uses historical data about the particular system to make predictions. It returns a list of probability estimates of the form (t_i, p_i), where p_i is the probability that the job will start before time t_i. The times t_i can be chosen more or less freely when deploying the system. Compared to regression methods, which return only a single number as a queue wait time estimate (usually without error bars), our prediction system provides more useful information. The probability estimates are calculated using Bayes' theorem with the naive assumption that the attributes describing the jobs are independent. They are further calibrated to make sure they are as accurate as possible, given the available data. We describe our service, its REST API and the underlying methods in detail, and provide empirical evidence in support of the method's efficacy.

This article is part of the theme issue ‘Multiscale modelling, simulation and computing: from the desktop to the exascale’.
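The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`train`, `predict`), the job attributes (`queue`, `cores`), the Laplace smoothing and the sample data are all assumptions made for illustration, and the calibration step mentioned in the abstract is omitted. For each chosen time t_i, a separate naive Bayes model estimates the probability that a job with the given attributes starts before t_i.

```python
import math
from collections import Counter, defaultdict


def train(jobs, thresholds):
    """Fit one naive Bayes model per threshold.

    jobs: list of (attributes dict, observed wait in seconds).
    thresholds: times t_i (seconds) for which probabilities are wanted.
    """
    # Distinct values seen per attribute, used for Laplace smoothing.
    values = defaultdict(set)
    for attrs, _ in jobs:
        for name, value in attrs.items():
            values[name].add(value)

    model = {}
    for t in thresholds:
        class_counts = Counter()               # label -> number of jobs
        feature_counts = defaultdict(Counter)  # (label, attr) -> value counts
        for attrs, wait in jobs:
            label = wait < t                   # True: job started before t
            class_counts[label] += 1
            for name, value in attrs.items():
                feature_counts[(label, name)][value] += 1
        model[t] = (class_counts, feature_counts)
    return model, values


def predict(model, values, attrs):
    """Return a list of (t_i, p_i) pairs for a job described by attrs."""
    estimates = []
    for t in sorted(model):
        class_counts, feature_counts = model[t]
        total = sum(class_counts.values())
        log_scores = {}
        for label in (True, False):
            # Smoothed class prior P(label).
            score = math.log((class_counts[label] + 1) / (total + 2))
            # Naive assumption: attributes are independent given the label.
            for name, value in attrs.items():
                count = feature_counts[(label, name)][value]
                n_values = len(values[name] | {value})
                score += math.log((count + 1) / (class_counts[label] + n_values))
            log_scores[label] = score
        # Normalise the two scores into a probability (Bayes' theorem).
        top = max(log_scores.values())
        started = math.exp(log_scores[True] - top)
        not_started = math.exp(log_scores[False] - top)
        estimates.append((t, started / (started + not_started)))
    return estimates


# Hypothetical historical data: (job attributes, queue wait in seconds).
history = [
    ({"queue": "fast", "cores": "small"}, 60),
    ({"queue": "fast", "cores": "small"}, 120),
    ({"queue": "fast", "cores": "large"}, 5000),
    ({"queue": "slow", "cores": "large"}, 7200),
    ({"queue": "slow", "cores": "small"}, 10000),
]
model, values = train(history, thresholds=[600, 3600, 14400])
for t, p in predict(model, values, {"queue": "fast", "cores": "small"}):
    print(f"P(start before {t} s) = {p:.2f}")
```

Because each threshold is modelled independently in this sketch, the resulting p_i are not guaranteed to be non-decreasing in t_i; a deployed system would need calibration or some other post-processing to address that.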

Highlights

The issue of queue wait times comes up in many situations in High Performance Computing (HPC)
We describe a method for queue wait time prediction in supercomputing clusters
Underused systems are unlikely to be interesting in terms of queue wait time predictions

Summary

Introduction

The issue of queue wait times comes up in many situations in High Performance Computing (HPC). The multi-criteria approach to resource selection, and the practical need for queue wait time prediction when estimating the time to finish, were the initial inspiration and the main motivation for the work described in this paper.

The Pattern Optimization Service, based on knowledge of the application itself and static information about the infrastructure, generates a list of assignment plans that determines the search space for the optimal allocation of resources to computational kernels (parts of the multi-scale application). This list is submitted to the QCG-Broker service as part of the job description, together with the requirements for computing resources and the definition of optimization criteria and limits.

Gao et al. [13] use a genetic algorithm to optimize job scheduling; their approach does not apply here because we need to estimate job queue wait time probabilities.

Naive Bayes for queue wait time prediction

Results

Conclusion

Journal: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
Publication Date: Feb 18, 2019
Citations: 7
License type: CC BY

Similar Papers

Trace-based evaluation of job runtime and queue wait time predictions in grids
Ozan Sonmez ... Nezih Yigitbasi
11 Jun 2009

Queue Waiting Time Prediction for Large-scale High-performance Computing System
Ju-Won Park
01 Jul 2019

Performance Evaluation in Grid Computing: A Modeling and Prediction Perspective
Hui Li
01 May 2007

Estimation of probabilities of three kinds of petrologic hypotheses with Bayes theorem
James Nicholls
Mathematical geosciences | VOL. 30
01 Jan 1998
