Abstract
Many highly parallel algorithms generate large volumes of data containing both valid and invalid elements, and high-performance solutions to the stream compaction problem prove extremely important in such scenarios. Although parallel stream compaction has been extensively studied on GPU-based platforms and, more recently, on the Intel Xeon Phi platform, no study has yet considered its parallelization on a low-cost computing cluster, even though general-purpose single-board computing devices are gaining popularity in the scientific community due to their high performance per dollar and per watt. In this work, we consider the case of an extremely low-cost cluster composed of four Odroid C2 single-board computers (SDCs), showing that stream compaction can also benefit from this kind of platform, with important speedups being obtained. To do so, we derive two parallel implementations of the stream compaction problem using MPI. We then evaluate them for varying numbers of processes and/or SDCs, as well as for different input sizes. In general, we see that unless the number of elements in the stream is too small, the best results are obtained when eight MPI processes are distributed among the four SDCs that make up the cluster. To add value to these results, we also run the two parallel implementations of stream compaction on a very high-performance but power-hungry 18-core Intel Xeon E5-2695 v4 multicore processor, finding that the Odroid C2 SDC cluster constitutes a much more efficient alternative when both execution time and required energy are taken into account. Finally, we also implement and evaluate a parallel version of the stream split problem, which additionally stores the invalid elements after the valid ones. Our implementation shows good scalability on the Odroid C2 SDC cluster and a better-balanced computation/communication ratio than the stream compaction problem.
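To make the problem statement concrete, the following is a minimal sketch of chunked stream compaction. It is not either of the paper's two MPI implementations: it simulates the processes sequentially in plain Python, and the names `compact_chunked`, `is_valid`, and `num_workers` are hypothetical. It assumes the common three-phase scheme (local compaction, exclusive prefix sum over the per-chunk counts, placement at the resulting offsets).

```python
from itertools import accumulate


def compact_chunked(stream, is_valid, num_workers):
    """Keep only valid elements, preserving order.

    Each 'worker' stands in for one process; this is a sequential
    simulation, not real MPI code. Assumes num_workers >= 1.
    """
    n = len(stream)
    chunk = (n + num_workers - 1) // num_workers  # ceiling division

    # Phase 1: every worker compacts its own contiguous chunk.
    local = [
        [x for x in stream[w * chunk:(w + 1) * chunk] if is_valid(x)]
        for w in range(num_workers)
    ]
    counts = [len(part) for part in local]

    # Phase 2: an exclusive prefix sum of the counts gives each
    # worker the position where its output starts.
    offsets = [0] + list(accumulate(counts))[:-1]

    # Phase 3: every worker writes its valid elements at its offset.
    out = [None] * sum(counts)
    for part, off in zip(local, offsets):
        out[off:off + len(part)] = part
    return out


if __name__ == "__main__":
    data = [3, -1, 4, -1, 5, 9, -2, 6]
    print(compact_chunked(data, lambda x: x >= 0, num_workers=4))
```

In a distributed setting, phase 2 is where the workers must communicate, which is one plausible reason why small inputs benefit less from adding processes.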
Highlights
Continuous improvements in the technologies used to build computers have recently made possible the fabrication of extremely low-cost general-purpose single-board computing devices
This manuscript extends a preliminary version of this work [27] by making the following two important additional contributions: (i) To highlight the importance of our study, we consider the execution of the two parallel implementations of the stream compaction problem on a very high-performance but power-hungry 18-core Intel Xeon CPU E5-2695 v4.
(ii) We derive a parallel version of the stream split problem, which appends the invalid elements to the output stream after the valid elements. We evaluate it on the cluster of Odroid C2 single-board computers (SDCs), observing good scalability that leads to important speedups, and a better balance between computation and communication requirements than in the stream compaction problem.
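As an illustration of the stream split problem described above, here is a minimal sequential sketch, again simulating the processes in plain Python rather than reproducing the paper's MPI code. The names `split_chunked`, `is_valid`, and `num_workers` are hypothetical. Unlike compaction, every input element appears in the output: valid elements first, invalid ones after, each group in its original order.

```python
def split_chunked(stream, is_valid, num_workers):
    """Return (output, number_of_valid_elements).

    The output holds all valid elements followed by all invalid ones.
    Sequential simulation of a chunked scheme; assumes num_workers >= 1.
    """
    n = len(stream)
    chunk = (n + num_workers - 1) // num_workers  # ceiling division

    # Each worker partitions its own chunk into valid and invalid parts.
    valid_parts, invalid_parts = [], []
    for w in range(num_workers):
        block = stream[w * chunk:(w + 1) * chunk]
        valid_parts.append([x for x in block if is_valid(x)])
        invalid_parts.append([x for x in block if not is_valid(x)])

    # The invalid region starts right after all valid elements.
    total_valid = sum(len(p) for p in valid_parts)

    out = [None] * n
    v_off, i_off = 0, total_valid
    for v, i in zip(valid_parts, invalid_parts):
        out[v_off:v_off + len(v)] = v
        out[i_off:i_off + len(i)] = i
        v_off += len(v)
        i_off += len(i)
    return out, total_valid


if __name__ == "__main__":
    data = [3, -1, 4, -1, 5, 9, -2, 6]
    print(split_chunked(data, lambda x: x >= 0, num_workers=4))
```

Because each worker handles both halves of its chunk, this variant does more local work per element exchanged than compaction does, which is consistent with the better computation/communication balance reported above.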
Summary
Continuous improvements in the technologies used to build computers have recently made possible the fabrication of extremely low-cost general-purpose single-board computing devices. Although the initial aim of these devices was to promote the teaching of basic computer science in schools [3,4] and developing countries [5,6,7], the recent appearance of single-board computers with multicore ARM CPU chips and several gigabytes of main memory provides a desirable hardware platform for the project-based learning paradigm in computer science and engineering education [8,9,10,11]. These devices have also attracted the interest of many projects seeking to take advantage of their very low cost-performance ratio (e.g., for scientific computing [12,13,14]), in contrast with other energy-efficient but more expensive alternatives [15]. The Raspberry Pi 2 model B released in February 2015 adds wireless connectivity