Abstract
Efficient scheduling among simultaneous simulation jobs is critical when allocating limited computing and I/O resources. The difficulty of predicting when a job will complete can cause nontrivial problems for system administrators and users, e.g., squandered resources, long waiting times, and delays to simulation plans. To alleviate these problems, we propose a novel simulation runtime estimation scheme termed CLUTCH, which employs a well-orchestrated ensemble of clustering, classification, and regression techniques. The proposed scheme trains a runtime estimation model through a series of steps: (i) grouping past simulation provenance records by clustering, (ii) labeling each of the grouped records by classification, and (iii) performing regression on the execution times in each group. Given a simulation and its external arguments, the trained model predicts the simulation's runtime with high accuracy in a black-box fashion, using only basic external arguments and no extra information. We additionally propose two optimization algorithms that significantly reduce training overhead without sacrificing estimation quality. In experiments with real datasets, our model achieved an improvement of approximately 14.2% in estimation accuracy over the most recent state-of-the-art method; with our optimizations applied, the model was trained 16 times faster while retaining its accuracy.
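The three training steps named in the abstract (cluster, classify, regress) can be illustrated with a small Python sketch. This is not the authors' implementation, which was written in R, and every concrete choice here is an assumption made for illustration: clustering is done with a one-dimensional k-means on past runtimes, classification is a 1-nearest-neighbour lookup in argument space, and regression is an ordinary least-squares line on the first argument only. All function names are hypothetical.

```python
from statistics import mean


def kmeans_1d(values, k, iters=50):
    """Lloyd's k-means on scalars; returns one cluster label per value."""
    srt = sorted(values)
    # Deterministic initialisation: spread the centres over the sorted values.
    centers = [srt[(len(srt) - 1) * i // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = mean(members)
    return labels


def nearest_label(args, train_args, labels):
    """1-nearest-neighbour classifier using squared Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best = min(range(len(train_args)), key=lambda i: dist(args, train_args[i]))
    return labels[best]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - slope * mx, slope


def train(records, k=2):
    """records: list of (args_tuple, runtime). Returns a simple model tuple."""
    args = [a for a, _ in records]
    runtimes = [t for _, t in records]
    # Step (i): group past records (assumption: by runtime).
    labels = kmeans_1d(runtimes, k)
    # Step (iii): one regression per group, on the first argument only.
    models = {}
    for c in set(labels):
        xs = [a[0] for a, lab in zip(args, labels) if lab == c]
        ys = [t for t, lab in zip(runtimes, labels) if lab == c]
        models[c] = fit_line(xs, ys)
    # Step (ii) is applied at prediction time via the labelled records.
    return args, labels, models


def predict(model, new_args):
    """Classify the new arguments into a group, then apply that group's line."""
    args, labels, models = model
    intercept, slope = models[nearest_label(new_args, args, labels)]
    return intercept + slope * new_args[0]


# Synthetic records: args = (problem_size, solver_flag); two runtime regimes.
records = [((s, 0), 10 + 2 * s) for s in range(1, 6)]
records += [((s, 1), 1000 + 50 * s) for s in range(1, 6)]
model = train(records, k=2)
print(predict(model, (6, 0)), predict(model, (6, 1)))
```

The point of the grouping step in this sketch is that a single regression over both regimes would fit neither well, whereas one line per group recovers each regime. A nearest-neighbour classifier is sensitive to the scale of the arguments, so real inputs would likely need normalisation first.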
Highlights
Runtime estimation has long been an important task for black box-based online simulation platform services [1].
Evaluation environment: our proposed scheme, CLUTCH, was developed in R [36], and the source code is currently available in a GitHub repository.
The first metric is the Mean Absolute Percentage Error (MAPE) [38], the most basic metric that can be considered for comparison with competing methods.
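For reference, MAPE averages the absolute error of each prediction relative to the true value and expresses it as a percentage. The Python sketch below uses the standard definition. The function name and the decision to reject zero-valued actual runtimes are assumptions of this example, not details taken from the paper.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    MAPE = (100 / n) * sum(|actual_i - predicted_i| / |actual_i|)
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Two jobs: true runtimes 100 s and 200 s, estimates 110 s and 180 s.
# Each estimate is off by 10% of the true value.
error = mape([100, 200], [110, 180])
```

Because each error is divided by the true runtime, short and long jobs contribute on the same relative scale, which suits workloads whose runtimes span several orders of magnitude.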
Summary
Runtime estimation has long been an important task for black box-based online simulation platform services [1]. The main concern is that many simulations rely on high-performance computing (HPC) and storage resources and incur very long execution times, sometimes reaching months. Such long execution times can lead to a variety of issues, such as (i) leaving users waiting with no information about when their simulation will end; (ii) unexpectedly delaying simulation schedules; and (iii) wasting limited online simulation resources, occasionally because a wrong combination of simulation input values triggers an infinite loop. To address these concerns, this paper proposes a CLUsTering-based sCHeme for estimating simulation execution time, which we call CLUTCH.