Abstract
The growing size and complexity of cloud systems create scalability issues for resource monitoring and management. While most existing solutions consider each Virtual Machine (VM) as a black box with independent characteristics, we embrace a new perspective where VMs with similar behaviors in terms of resource usage are clustered together. We argue that this new approach has the potential to address scalability issues in cloud monitoring and management. In this paper, we propose a technique to cluster VMs starting from the usage of multiple resources, assuming no knowledge of the services executed on them. This technique models VM behavior through the probability histograms of their resource usage, applies smoothing-based noise reduction, and selects the most relevant information to consider for the clustering process. Through extensive evaluation, we show that our proposal achieves high and stable performance in terms of automatic VM clustering, and can reduce the monitoring requirements of cloud systems.
Highlights
The cloud computing paradigm has emerged in the last few years as a way to cope with the demands of modern applications by exploiting virtualization techniques in large data centers.
It is worth noting that in this and the following experiments we use a value of σ² equal to 0.25 for the smoothing function in Equation 1. The choice of this value is motivated by preliminary tests, not reported here for space reasons, where we evaluate the impact of this parameter on the Bhattacharyya distance between non-smoothed and smoothed histograms for multiple metrics and VMs: we found that any value of σ² in the range 0.1 < σ² < 0.4 provides a significant noise reduction while preserving the main shape of the histograms.
We propose a novel technique, namely SH-based, for automatic clustering of VMs sharing similar behavior, to improve the scalability of the monitoring process in cloud data centers.
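The histogram smoothing and the Bhattacharyya-distance comparison mentioned in the highlights can be illustrated with a minimal sketch. Equation 1 is not reproduced in this summary, so the code assumes a discrete Gaussian kernel over bin indices with variance σ² = 0.25. The function names, the number of bins, the kernel radius, and the −ln(BC) form of the Bhattacharyya distance are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def probability_histogram(samples, bins=20, value_range=(0.0, 1.0)):
    """Normalized histogram of resource-usage samples (sums to 1)."""
    counts, _ = np.histogram(samples, bins=bins, range=value_range)
    total = counts.sum()
    if total == 0:
        # No sample fell in the range: fall back to a uniform histogram.
        return np.full(bins, 1.0 / bins)
    return counts / total


def gaussian_smooth(hist, sigma2=0.25, radius=3):
    """Smooth a histogram with a discrete Gaussian kernel (assumed form).

    sigma2 is the kernel variance, expressed in units of histogram bins.
    """
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-(offsets ** 2) / (2.0 * sigma2))
    kernel /= kernel.sum()
    smoothed = np.convolve(hist, kernel, mode="same")
    # Renormalize because mass is lost at the histogram borders.
    return smoothed / smoothed.sum()


def bhattacharyya_distance(p, q, eps=1e-12):
    """Bhattacharyya distance, here taken as -ln(sum(sqrt(p * q)))."""
    bc = np.sum(np.sqrt(p * q))
    # Clip to guard against log(0) and floating-point values above 1.
    return float(-np.log(np.clip(bc, eps, 1.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic CPU utilization in [0, 1] for three hypothetical VMs:
    # vm_a and vm_b behave alike, vm_c behaves differently.
    vm_a = rng.beta(2, 5, size=2000)
    vm_b = rng.beta(2, 5, size=2000)
    vm_c = rng.beta(5, 2, size=2000)

    h_a, h_b, h_c = (gaussian_smooth(probability_histogram(v))
                     for v in (vm_a, vm_b, vm_c))
    print("distance a-b:", bhattacharyya_distance(h_a, h_b))
    print("distance a-c:", bhattacharyya_distance(h_a, h_c))
```

In a sketch like this, the pairwise distances between smoothed histograms would feed a clustering algorithm; which algorithm the SH-based technique uses is not stated in this summary.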
Summary
The cloud computing paradigm has emerged in the last few years as a way to cope with the demands of modern applications by exploiting virtualization techniques in large data centers. Many customers are outsourcing services and moving their applications from internal data centers to cloud platforms under long-term commitments, purchasing several VMs for extended periods of time (for example, integrating a data center with the so-called Amazon reserved instances). As this scenario is, and is expected to remain, a significant part of the cloud ecosystem [1], we assume in the present study that customer VMs do not frequently change the software component they are running and that a single software component is typically deployed on several different VMs for reliability and scalability purposes. As VMs are traditionally considered independent black boxes, management strategies require collecting information about every single VM in the data center. This means that gathering data about VMs exhibiting similar behaviors results in the collection of redundant information, hindering the scalability of monitoring tasks for the cloud system.