Abstract

Cloud providers and data centres rely heavily on forecasts to predict future workload accurately. These forecasts guide appropriate virtualization and cost-effective provisioning of the infrastructure. The accuracy of a forecast depends greatly on the quality of the performance data fed to the underlying algorithms. One of the fundamental problems faced by analysts in preparing data for use in forecasting is the timely identification of data discontinuities. A discontinuity is an abrupt change in a time-series pattern of a performance counter that persists but does not recur. We used a supervised and an unsupervised technique to automatically identify the important performance counters that are likely indicators of discontinuities within performance data. We compared the performance of our approaches by conducting a case study on performance data obtained from a large-scale cloud provider as well as on open-source benchmark systems. The supervised counter selection approach yields better results than the unsupervised approach but bears the overhead of manually labelling the performance data.
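
To make the notion of a discontinuity concrete, the following is a minimal illustrative sketch, not the approach evaluated in the paper, of flagging a persistent level shift in a synthetic performance counter. The window size, threshold, and counter values are hypothetical parameters chosen purely for illustration.

    import numpy as np

    def detect_level_shift(series, window=20, threshold=3.0):
        """Flag points where the mean of the trailing window differs from the
        leading window by more than `threshold` pooled standard deviations,
        i.e. an abrupt change in level that persists rather than recurs."""
        candidates = []
        for t in range(window, len(series) - window):
            before = series[t - window:t]
            after = series[t:t + window]
            pooled_std = np.sqrt((before.std() ** 2 + after.std() ** 2) / 2) + 1e-9
            if abs(after.mean() - before.mean()) / pooled_std > threshold:
                candidates.append(t)
        return candidates

    # Synthetic CPU-utilisation counter with a persistent level shift at t = 100.
    rng = np.random.default_rng(0)
    counter = np.concatenate([rng.normal(40, 2, 100), rng.normal(70, 2, 100)])
    print(detect_level_shift(counter)[:3])  # indices near 100

A counter that exhibits such persistent, non-recurring shifts is the kind of candidate the supervised and unsupervised selection techniques described above aim to surface automatically.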
