Abstract

Cloud providers and data centres rely heavily on forecasts to predict future workload accurately. These forecasts guide appropriate virtualization and cost-effective provisioning of the infrastructure. The accuracy of a forecast depends greatly on the quality of the performance data fed to the underlying algorithms. One of the fundamental problems faced by analysts in preparing data for use in forecasting is the timely identification of data discontinuities. A discontinuity is an abrupt change in a time-series pattern of a performance counter that persists but does not recur. We used a supervised and an unsupervised technique to automatically identify the important performance counters that are likely indicators of discontinuities within performance data. We compared the performance of our approaches by conducting a case study on performance data obtained from a large-scale cloud provider as well as on open-source benchmark systems. The supervised counter selection approach yields better results than the unsupervised approach but bears the overhead of manually labelling the performance data.
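
To make the notion of a discontinuity concrete, the following is a minimal illustrative sketch, not the approach evaluated in the paper, of flagging a persistent level shift in a synthetic performance counter. The window size, threshold, and counter values are hypothetical parameters chosen purely for illustration.

    import numpy as np

    def detect_level_shift(series, window=20, threshold=3.0):
        """Flag points where the mean of the trailing window differs from the
        leading window by more than `threshold` pooled standard deviations,
        i.e. an abrupt change in level that persists rather than recurs."""
        candidates = []
        for t in range(window, len(series) - window):
            before = series[t - window:t]
            after = series[t:t + window]
            pooled_std = np.sqrt((before.std() ** 2 + after.std() ** 2) / 2) + 1e-9
            if abs(after.mean() - before.mean()) / pooled_std > threshold:
                candidates.append(t)
        return candidates

    # Synthetic CPU-utilisation counter with a persistent level shift at t = 100.
    rng = np.random.default_rng(0)
    counter = np.concatenate([rng.normal(40, 2, 100), rng.normal(70, 2, 100)])
    print(detect_level_shift(counter)[:3])  # indices near 100

A counter that exhibits such persistent, non-recurring shifts is the kind of candidate the supervised and unsupervised selection techniques described above aim to surface automatically.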
