Abstract
In the realm of big data, where datasets of immense scale pose processing challenges, distributed processing platforms such as the open-source Apache Spark have emerged to address these issues. Spark's internal configuration parameters affect execution time to varying degrees depending on job characteristics, which makes manual tuning daunting. The core focus of this study is optimizing Spark's internal configurations, with specific attention to three types of workloads: iterative-intensive, memory-intensive, and CPU-intensive. Applying Grid Search, Random Search, and Evolutionary Optimization algorithms yields substantial execution time reductions: 23.24% with Grid Search, 19.71% with Random Search, and 23.06% with Evolutionary Optimization. Notably, Evolutionary Optimization reaches near-optimal configurations approximately 29% faster than Grid Search. While Random Search and Evolutionary Optimization require similar search time, Random Search achieves a smaller execution time reduction for a given Spark workload. This research sheds light on the intricacies of algorithmic configuration tuning and its influence on Spark workload execution times, contributing to the broader effort of optimizing big data processing platforms.
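To make the tuning setup concrete, the sketch below shows how a Random Search over a Spark configuration space might be structured. The parameter names are standard Spark settings, but the search space values and the cost function are entirely illustrative assumptions: a real study would submit each sampled configuration to a cluster and measure the actual job execution time, whereas here a synthetic stand-in is used so the example is self-contained.

```python
import random

# Illustrative search space over a few well-known Spark parameters.
# The candidate values are assumptions, not recommendations.
SEARCH_SPACE = {
    "spark.executor.memory": ["2g", "4g", "8g"],
    "spark.executor.cores": [2, 4, 8],
    "spark.sql.shuffle.partitions": [100, 200, 400],
    "spark.default.parallelism": [50, 100, 200],
}


def run_workload(config):
    """Synthetic stand-in for submitting a Spark job and timing it.

    In practice this would launch the workload with the given
    configuration and return the measured execution time in seconds.
    """
    cost = 100.0
    cost -= {"2g": 0.0, "4g": 10.0, "8g": 15.0}[config["spark.executor.memory"]]
    cost -= config["spark.executor.cores"] * 2.0
    cost += abs(config["spark.sql.shuffle.partitions"] - 200) * 0.05
    return cost


def random_search(n_trials=30, seed=42):
    """Sample n_trials random configurations and keep the fastest one."""
    rng = random.Random(seed)
    best_config, best_time = None, float("inf")
    for _ in range(n_trials):
        config = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        elapsed = run_workload(config)
        if elapsed < best_time:
            best_config, best_time = config, elapsed
    return best_config, best_time
```

Grid Search would instead enumerate the full Cartesian product of the candidate values (exhaustive but slow), and Evolutionary Optimization would mutate and recombine promising configurations across generations, which is consistent with the paper's finding that it converges faster than Grid Search at a similar quality.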