Abstract

Understanding how host load changes over time is instrumental in predicting the execution time of tasks or jobs, such as in dynamic load balancing and distributed soft real‐time systems. To improve this understanding, we collected week‐long, 1 Hz resolution traces of the Digital Unix 5 second exponential load average on over 35 different machines including production and research cluster machines, compute servers, and desktop workstations. Separate sets of traces were collected at two different times of the year. The traces capture all of the dynamic load information available to user‐level programs on these machines. We present a detailed statistical analysis of these traces here, including summary statistics, distributions, and time series analysis results. Two significant new results are that load is self‐similar and that it displays epochal behavior. All of the traces exhibit a high degree of self‐similarity with Hurst parameters ranging from 0.73 to 0.99, strongly biased toward the top of that range. The traces also display epochal behavior in that the local frequency content of the load signal remains quite stable for long periods of time (150–450 s mean) and changes abruptly at epoch boundaries. Despite these complex behaviors, we have found that relatively simple linear models are sufficient for short‐range host load prediction.

Highlights

  • The distributed computing environments to which most users have access consist of a collection of loosely interconnected hosts running vendor operating systems

  • Autocorrelation is a function of ∆, and in Fig. 7(b) we show the results for 0 ≤ ∆ ≤ 600

  • We examined each of the load traces for self-similarity and estimated each one’s Hurst parameter
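
To make the autocorrelation highlight concrete, here is a minimal sketch of a sample autocorrelation computed over lags 0 through 600. It is not the authors' code: the function name `autocorrelation` and the synthetic sinusoidal `trace` are assumptions made for illustration, standing in for a real 1 Hz load trace.

```python
import math


def autocorrelation(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag (inclusive)."""
    n = len(series)
    if n < 2 or max_lag >= n:
        raise ValueError("need len(series) > max_lag and at least 2 samples")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    # Sum of squared deviations; dividing by it normalizes lag 0 to 1.
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / variance
        for lag in range(max_lag + 1)
    ]


# Hypothetical stand-in for a load trace: a slow sinusoid with a 200 s period.
trace = [1.0 + math.sin(2 * math.pi * t / 200) for t in range(2000)]
acf = autocorrelation(trace, 600)
```

With a real trace, a slowly decaying `acf` over these lags would be consistent with the strong time correlation the paper reports; the sinusoid here only exercises the function.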


Summary

Introduction

The distributed computing environments to which most users have access consist of a collection of loosely interconnected hosts running vendor operating systems. The Hurst parameters of the load traces range from 0.73 to 0.99, with a strong bias toward the top of that range. This tells us that load varies in complex ways on all time scales and is long-range dependent. The variance of the load averaged over intervals of length $m$ decays with increasing $m$ as $m^{2H-2}$, where $H$ is the Hurst parameter. This is $m^{-1.0}$ for signals without long-range dependence and $m^{-0.54}$ to $m^{-0.02}$ for the range of $H$ we measured. The local frequency content of the load signal remains quite stable for long periods of time (150–450 s mean) and changes abruptly at the boundaries of such epochs. This suggests that the problem of predicting load may be decomposable into a sequence of smaller subproblems. We evaluated linear models for predicting host load using the traces, finding that relatively simple autoregressive models are sufficient for short-range host load prediction [7].
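
The variance-decay relationship above can be turned into a rough Hurst estimate. The sketch below uses the aggregated-variance (variance-time) approach: average the series over non-overlapping blocks of length m, fit the slope of log variance against log m, and map the slope back through H = 1 + slope/2. This is one common estimator and not necessarily the one used in the paper; the function names, the block sizes, and the white-noise test signal are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def variance_decay_exponent(hurst):
    """Exponent beta in Var(block mean) ~ m**beta, with beta = 2H - 2."""
    return 2.0 * hurst - 2.0


def aggregated_variance(series, m):
    """Variance of the means of non-overlapping blocks of length m."""
    blocks = len(series) // m
    means = [sum(series[i * m:(i + 1) * m]) / m for i in range(blocks)]
    return statistics.pvariance(means)


def estimate_hurst(series, block_sizes):
    """Least-squares slope of log(variance) vs log(m), mapped to H = 1 + slope/2."""
    xs = [math.log(m) for m in block_sizes]
    ys = [math.log(aggregated_variance(series, m)) for m in block_sizes]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return 1.0 + slope / 2.0


# Hypothetical test signal: seeded white noise, which has no long-range dependence.
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
h_noise = estimate_hurst(white_noise, [1, 2, 4, 8, 16, 32, 64])
```

For white noise the estimate should land near 0.5, which corresponds to the $m^{-1.0}$ decay; `variance_decay_exponent(0.73)` and `variance_decay_exponent(0.99)` reproduce the $-0.54$ and $-0.02$ exponents quoted for the measured range.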

Measurement methodology
Statistical analysis
Summary statistics
Self-similarity
Epochal behavior
Findings
Conclusions and future work