Abstract
Multithreaded latency-critical applications represent an important subset of the workloads running on public cloud systems. Most of these systems deploy powerful servers built on Intel processors with Hyper-Threading. Understanding how performance is affected by the consumption of the main system resources is a major concern for cloud providers, since it allows them to devise virtualization strategies that improve system efficiency. With this aim, this paper first characterizes the impact of queries per second (QPS) on tail latency, analyzing different scenarios that vary the number of threads and the thread-to-core allocation policy (single-task and multi-task execution). The characterization study reveals that the performance of some applications does not scale with the number of threads, and that the performance of others is insensitive to Hyper-Threading, so they can be allocated to fewer physical cores, improving system utilization. Although identifying these applications at run-time is challenging, this paper shows that they can be successfully detected by analyzing the utilization trend of the major system resources. In addition to CPU, we also study how the share of other major shared system resources assigned to each application affects performance. We outline considerations that cloud providers should take into account to improve performance and resource utilization.