XDataExplorer: A Three-Stage Comprehensive Self-Tuning Tool for Big Data Platforms

Qing Guo, Qiushi Li, Yifan Zhu, Yingying Xie

doi:10.1016/j.bdr.2022.100329

Abstract

To meet the challenges of massive data, many big data platforms have been used in practice. These data processing platforms expose many correlated parameters that affect processing performance, so configuring them properly is challenging for users with different roles. This paper proposes XDataExplorer, a new comprehensive self-tuning tool for big data platforms. It is based on a three-stage optimization approach that improves performance successively through system-level optimization, application-level optimization, and fine-grain tuning. System-level optimization is guided by expert knowledge that is used to update the system variables. Application-level optimization uses rule-based heuristic methods driven by metrics computed from recent application history. The final stage, fine-grain tuning, uses a hill-climbing algorithm to iteratively examine combinations of system-level and application-level parameters. Several performance optimization best practices, pieces of expert knowledge, and heuristic rules are also summarized in this paper. Through these stages of gradual tuning, the proposed tool can rapidly improve the productivity of a big data platform and make processes run more efficiently. System evaluations show that, compared to default configurations, the suggested configurations improve performance by 15% to 60%, depending on the workload.
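The fine-grain tuning stage can be illustrated with a generic hill-climbing loop over a discrete parameter grid. The sketch below is not the authors' implementation: the parameter names (`executor_memory_gb`, `shuffle_partitions`), the candidate values, and the synthetic `toy_cost` function are all hypothetical and stand in for a measured workload runtime.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Config = Dict[str, int]


def neighbors(config: Config, grid: Dict[str, List[int]]) -> Iterator[Config]:
    """Yield configurations that differ from `config` by one grid step in one parameter."""
    for name, values in grid.items():
        idx = values.index(config[name])
        for j in (idx - 1, idx + 1):
            if 0 <= j < len(values):
                candidate = dict(config)
                candidate[name] = values[j]
                yield candidate


def hill_climb(
    start: Config,
    grid: Dict[str, List[int]],
    cost: Callable[[Config], float],
) -> Tuple[Config, float]:
    """Steepest-descent hill climbing: move to the best neighbor until none improves."""
    current, current_cost = dict(start), cost(start)
    while True:
        best = min(neighbors(current, grid), key=cost, default=None)
        if best is None or cost(best) >= current_cost:
            return current, current_cost  # local optimum reached
        current, current_cost = best, cost(best)


# Hypothetical search space; a real tuner would use platform parameters.
GRID = {
    "executor_memory_gb": [2, 4, 8, 16],
    "shuffle_partitions": [50, 100, 200, 400],
}


def toy_cost(cfg: Config) -> float:
    """Synthetic stand-in for a measured runtime (assumption, lower is better)."""
    return (cfg["executor_memory_gb"] - 8) ** 2 + (cfg["shuffle_partitions"] - 200) ** 2 / 100


best_cfg, best_cost = hill_climb(
    {"executor_memory_gb": 2, "shuffle_partitions": 50}, GRID, toy_cost
)
print(best_cfg, best_cost)
```

In practice each call to the cost function would correspond to running or replaying a workload, so a tuner would likely cache evaluations. Hill climbing only guarantees a local optimum, which is one reason to start it from a configuration already improved by earlier tuning stages.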

Full Text