Abstract

Low-cost, general-purpose single-board computer (SBC) devices are attracting growing interest in research computing because of their very low cost/performance ratio and low energy consumption. Among the SBCs available today, Raspberry Pi devices are perhaps the best known. Meanwhile, the wavelet transform plays an important role in contemporary standards for image compression (such as JPEG-2000) and video compression (MPEG-4). In this work, we present and evaluate three parallelization strategies for the 3D fast wavelet transform (3D-FWT) on a cluster of Raspberry Pi 2 SBCs. Each strategy has been implemented using both POSIX Threads (shared memory) and MPI (message passing). The POSIX Threads implementations are restricted to a single board, whereas the MPI versions can use multiple boards. We find that noticeable speed-ups are obtained when all MPI processes or POSIX threads run on the cores of a single Raspberry Pi 2 SBC. For the MPI versions, however, performance drops drastically when the MPI processes are spread across several boards. The reason is the limited bandwidth of the onboard LAN port, which proves insufficient for the fine-grained, high-volume communication requirements of the studied parallelization strategies. Finally, we have also run the POSIX Threads and MPI versions on a very high-performance but power-hungry 4-core Intel Xeon CPU E5606, and find that the Raspberry Pi 2 SBC can complete the task with much lower total energy consumption (up to 4 times lower).
