Abstract

To improve resource efficiency and design intelligent schedulers for clouds, it is necessary to understand the workload characteristics and machine utilization of large-scale cloud data centers. In this paper, we perform a deep analysis of a trace dataset released by Alibaba in September 2017, which consists of detailed statistics for 11089 online service jobs and 12951 batch jobs co-located on 1300 machines over 12 hours. To the best of our knowledge, this is one of the first works to analyze the Alibaba public trace. Our analysis reveals several important insights about different types of imbalance in the Alibaba cloud. 1) Spatial imbalance: heterogeneous resource utilization across machines and workloads. 2) Temporal imbalance: highly time-varying resource usage per workload and machine. 3) Imbalanced proportions of multi-dimensional resource (CPU and memory) utilization per workload. 4) Imbalanced resource demands and runtime statistics (duration and task number) between online service and offline batch jobs. Such imbalances increase the complexity and challenge of cloud resource management and may lead to severe resource waste and low cluster utilization. We argue that accommodating such imbalances during resource allocation is critical to improving cluster efficiency and will motivate the emergence of new resource managers and schedulers.

