Task-based programming in COMPSs to converge from HPC to big data

Javier Conejero, Jesus Labarta, Sandra Corella, Rosa M Badia

doi:10.1177/1094342017701278

Abstract

Task-based programming has proven to be a suitable model for high-performance computing (HPC) applications. Several implementations have demonstrated this and have promoted the acceptance of task-based programming in the OpenMP standard. In recent years, Apache Spark has also gained wide popularity in business and research environments as a programming model for addressing emerging big data problems. COMP Superscalar (COMPSs) is a task-based environment that targets distributed computing (including Clouds) and is a good alternative as a task-based programming model for big data applications. This article explains why we consider task-based programming models a good approach for big data applications. It compares Spark and COMPSs in terms of architecture, programming model, and performance. The comparison covers the structural differences between the two frameworks, their programmability interfaces, and their efficiency as measured with three widely known benchmarking kernels: Wordcount, Kmeans, and Terasort. These kernels exercise the most important functionalities of both programming models and allow different workflows and conditions to be analyzed. The main results of this comparison are that (1) COMPSs is able to extract the inherent parallelism from the user code with minimal coding effort, whereas Spark requires existing algorithms to be adapted and rewritten explicitly in terms of its predefined functions, (2) COMPSs improves on Spark in terms of performance, and (3) COMPSs has been shown to scale better than Spark in most cases. Finally, we discuss the advantages and disadvantages of both frameworks, highlighting the differences that make them unique, thereby helping to choose the right framework for each particular objective.
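To make the task-based style concrete, the following is a minimal sketch of a Wordcount kernel written as ordinary functions that are submitted as tasks. It is not COMPSs or Spark code: it uses Python's standard `concurrent.futures` thread pool as a stand-in for a task runtime, and the names `count_words`, `merge_counts`, and `wordcount` are hypothetical, chosen only for illustration. The abstract does not describe the paper's actual implementation, so this should be read as an illustration of the general idea under those assumptions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(block):
    """Task: count word occurrences in one block of text."""
    return Counter(block.split())


def merge_counts(left, right):
    """Task: combine two partial counts into one."""
    return left + right


def wordcount(blocks, max_workers=4):
    """Run one counting task per block, then reduce the partial counts pairwise.

    Assumption: a thread pool stands in for the task runtime here.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map phase: one independent task per input block.
        futures = [pool.submit(count_words, block) for block in blocks]
        partials = [future.result() for future in futures]

        # Reduce phase: merge neighbouring results until one remains.
        while len(partials) > 1:
            pairs = [
                pool.submit(merge_counts, partials[i], partials[i + 1])
                for i in range(0, len(partials) - 1, 2)
            ]
            # An odd element has no partner and is carried to the next round.
            leftover = [partials[-1]] if len(partials) % 2 else []
            partials = [pair.result() for pair in pairs] + leftover

    return partials[0] if partials else Counter()


if __name__ == "__main__":
    text_blocks = ["spark compss spark", "compss task", "task task"]
    print(wordcount(text_blocks))
```

The point of the sketch is that the counting and merging logic stays as plain sequential-looking functions, and only the submission of those functions as tasks exposes the parallelism. In this stand-in the submission is explicit, which a task-based runtime would normally handle on the programmer's behalf.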


Journal: The International Journal of High Performance Computing Applications
Publication Date: Apr 6, 2017
Citations: 25

Similar Papers

HPC Process and Optimal Network Device Affinitization
Ravindra Babu Ganapathi ... Aravind Gopalakrishnan
IEEE Transactions on Multi-Scale Computing Systems | VOL. 4
01 Oct 2018

Regression-Based Prediction for Task-Based Program Performance
Isil Oz ... Muhammad Khurram Bhatti
Journal of Circuits, Systems and Computers | VOL. 28
31 Mar 2019

Boosting Atmospheric Dust Forecast with PyCOMPSs
Javier Conejero ... Rosa M Badia
01 Oct 2018

SciDP: Support HPC and Big Data Applications via Integrated Scientific Data Processing
Kun Feng ... Xi Yang
01 Sep 2018
