Abstract

Datacenter traffic has grown significantly due to the rising number of web applications on the Internet. These applications have diverse Quality of Service (QoS) requirements, making datacenter management a complex task. For a datacenter, the amount of resources required of a given resource type (computing, memory, network, and storage) is termed the workload. In cloud datacenters, workload classification and characterization are used for resource management, application performance management, capacity sizing, and estimation of future resource demand. Accurate estimation of future resource demand helps meet QoS requirements and ensures efficient resource utilization. Modeling and characterization of datacenter workloads is therefore necessary to meet application performance requirements in a cost-efficient manner. In this paper, a methodology to classify datacenter workloads and characterize them based on resource usage is proposed. Two workload datasets are used: the Google Cluster Trace (GCT) and the Bit Brains Trace (BBT). Seven machine learning algorithms are applied to workload classification, and the workload distribution in a mix of heterogeneous applications is estimated for both GCT and BBT. The seven algorithms are compared on the basis of classification accuracy. Finally, an algorithm to estimate the importance of different attributes for classification is proposed.
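The kind of pipeline the abstract describes (classify workloads by resource usage, then rank the attributes that drive the classification) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the synthetic data, feature names, labeling rule, and choice of a random-forest classifier with impurity-based importances are all assumptions made here for demonstration.

```python
# Hypothetical sketch: classify synthetic workload records by resource usage,
# then rank attribute importance. Illustrative only, not the paper's method.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
# Synthetic per-task resource-usage attributes (assumed names):
# CPU, memory, disk I/O, network.
X = rng.random((n, 4))
# Illustrative label rule: task is memory-bound (1) when memory usage
# exceeds CPU usage, else CPU-bound (0).
y = (X[:, 1] > X[:, 0]).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))

# One common way to estimate attribute importance for classification:
# impurity-based feature importances from the fitted forest.
names = ["cpu", "memory", "disk_io", "network"]
ranking = sorted(zip(names, clf.feature_importances_), key=lambda t: -t[1])
print(f"accuracy={acc:.2f}")
print(ranking)
```

On this synthetic data the two informative attributes (CPU and memory) dominate the importance ranking, while the noise attributes (disk I/O, network) score near zero; any of the seven classifiers the paper compares could be slotted in place of the random forest.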
