Abstract

Purpose: This paper aims to evaluate different approaches for the parallelization of compute-intensive tasks. The study compares a Java multi-threaded algorithm, distributed computing solutions with MapReduce (Apache Hadoop) and resilient distributed data set (RDD) (Apache Spark) paradigms and a graphics processing unit (GPU) approach with Numba for compute unified device architecture (CUDA).

Design/methodology/approach: The paper uses a simple but computationally intensive puzzle as a case study for experiments. To find all solutions using brute force search, 15! permutations had to be computed and tested against the solution rules. The experimental application comprises a Java multi-threaded algorithm, distributed computing solutions with MapReduce (Apache Hadoop) and RDD (Apache Spark) paradigms and a GPU approach with Numba for CUDA. The implementations were benchmarked on Amazon-EC2 instances for performance and scalability measurements.

Findings: The comparison of the solutions with Apache Hadoop and Apache Spark under Amazon EMR showed that the processing time measured in CPU minutes with Spark was up to 30% lower, while the performance of Spark especially benefits from an increasing number of tasks. With the CUDA implementation, more than 16 times faster execution is achievable for the same price compared to the Spark solution. Apart from the multi-threaded implementation, the processing times of all solutions scale approximately linearly. Finally, several application suggestions for the different parallelization approaches are derived from the insights of this study.

Originality/value: There are numerous studies that have examined the performance of parallelization approaches. Most of these studies deal with processing large amounts of data or mathematical problems. This work, in contrast, compares these technologies on their ability to implement computationally intensive distributed algorithms.
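To illustrate how a brute-force search over 15! (1,307,674,368,000) permutations can be divided among independent workers, the following is a minimal Python sketch and not the implementation used in the paper. It assumes permutations are addressed by their lexicographic index, so each worker only needs a half-open index range. The function names and `placeholder_rule` are hypothetical; the abstract does not state the puzzle's actual solution rules.

```python
from math import factorial


def unrank_permutation(index, n):
    """Return the index-th permutation of 0..n-1 in lexicographic order."""
    items = list(range(n))
    perm = []
    for position in range(n, 0, -1):
        block = factorial(position - 1)  # permutations per leading choice
        digit, index = divmod(index, block)
        perm.append(items.pop(digit))
    return perm


def split_ranges(total, parts):
    """Split range(total) into `parts` contiguous, near-equal half-open ranges."""
    base, extra = divmod(total, parts)
    ranges, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def count_solutions(start, stop, n, is_solution):
    """Test every permutation with index in [start, stop) against the rule."""
    return sum(
        1 for i in range(start, stop) if is_solution(unrank_permutation(i, n))
    )


def placeholder_rule(perm):
    # Hypothetical rule for illustration only, NOT the puzzle's rules:
    # accept permutations whose first element is smaller than the last.
    return perm[0] < perm[-1]


if __name__ == "__main__":
    n = 6  # small stand-in; the paper's puzzle uses 15 elements
    chunks = split_ranges(factorial(n), 4)
    # Each chunk is independent, so the calls below could run in parallel.
    total = sum(count_solutions(a, b, n, placeholder_rule) for a, b in chunks)
    print(total)
```

Because each index range is self-contained, the same partitioning could in principle be handed to a thread, a MapReduce map task, a Spark partition or a GPU thread; how the paper actually distributes the work is not described in the abstract.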

