Abstract

Stragglers are widely recognized as having a major impact on the performance of big data systems. However, the causes of stragglers are complex. Previous work mostly focuses on straggler detection, scheduling optimization, and coarse-grained root-cause analysis, and these methods fail to provide insights that help users optimize their programs. In this paper, we propose BigRoots, a general method that incorporates both framework and system features for root-cause analysis of stragglers in big data systems. BigRoots analyzes stragglers using features from the big data framework, such as shuffle read/write bytes and JVM garbage collection time, together with system resource utilization, such as CPU, I/O, and network. This combination allows it to detect both internal and external causes of stragglers. We verify BigRoots by injecting high resource utilization across different system components, and we perform case studies that analyze different workloads in HiBench. The experimental results demonstrate that BigRoots effectively identifies the root causes of stragglers and provides useful guidance for performance optimization. After optimization based on the root causes identified by BigRoots, the workloads achieve significant performance improvement (37.74% in the best case).
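
To make the idea of feature-based root-cause analysis concrete, the following is a minimal illustrative sketch, not the BigRoots algorithm itself, whose details are not given in this abstract. It assumes a straggler is a task whose duration exceeds 1.5 times the median, and that a feature is a suspected cause when its value exceeds twice the median across tasks. The function names, field names, thresholds, and sample data are all hypothetical.

```python
from statistics import median

# Hypothetical per-task features: one framework-level pair and one system-level metric.
FEATURES = ["shuffle_read_bytes", "gc_time", "cpu_util"]


def find_stragglers(tasks, slow_factor=1.5):
    """Return tasks whose duration exceeds slow_factor times the median duration.

    The 1.5x-median rule is an assumed definition, not taken from the paper.
    """
    med = median(t["duration"] for t in tasks)
    return [t for t in tasks if t["duration"] > slow_factor * med]


def suspect_features(task, tasks, features, ratio=2.0):
    """Return features whose value in `task` exceeds ratio times the median over all tasks.

    The ratio threshold is an assumption chosen for illustration only.
    """
    suspects = []
    for name in features:
        med = median(t[name] for t in tasks)
        if task[name] > ratio * med:
            suspects.append(name)
    return suspects


# Made-up sample data: task 3 is slow and spends far more time in garbage collection.
tasks = [
    {"id": 0, "duration": 10.0, "shuffle_read_bytes": 100, "gc_time": 0.5, "cpu_util": 0.40},
    {"id": 1, "duration": 11.0, "shuffle_read_bytes": 110, "gc_time": 0.6, "cpu_util": 0.45},
    {"id": 2, "duration": 9.0, "shuffle_read_bytes": 90, "gc_time": 0.4, "cpu_util": 0.42},
    {"id": 3, "duration": 30.0, "shuffle_read_bytes": 105, "gc_time": 6.0, "cpu_util": 0.41},
]

for straggler in find_stragglers(tasks):
    print(straggler["id"], suspect_features(straggler, tasks, FEATURES))
```

On this sample data, only task 3 is flagged as a straggler, and garbage collection time is the only feature reported as a suspected cause.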
