Abstract
With the emergence of social networks and improvements in computational photography, billions of JPEG images are shared and viewed on a daily basis. Desktops, tablets, and smartphones constitute the vast majority of hardware platforms used for displaying JPEG images. Despite the fact that these platforms are heterogeneous multicores, no approach exists yet that is capable of joining forces of a system's CPU and graphics processing unit (GPU) for JPEG decoding. In this paper, we introduce a novel JPEG decoding scheme for heterogeneous architectures consisting of a CPU and a general‐purpose GPU. We employ an offline profiling step to determine the performance of a system's CPU and GPU with respect to JPEG decoding. For a given JPEG image, our performance model uses: (1) the CPU and GPU performance characteristics, (2) the image entropy, and (3) the width and height of the image to balance the JPEG decoding workload on the underlying hardware. Our run‐time partitioning and scheduling scheme exploits task, data, and pipeline parallelism by scheduling the non‐parallelizable entropy‐decoding task on the CPU, whereas inverse discrete cosine transformations, color conversions, and upsampling are conducted on both the CPU and the GPU. We have implemented the proposed method in the context of the libjpeg‐turbo library, which is an industrial‐strength JPEG encoding and decoding engine. Libjpeg‐turbo's hand‐optimized SIMD routines for ARM and x86 architectures constitute a competitive yardstick for the comparison with the proposed approach. We have evaluated our approach for a total of 7194 JPEG images across four high‐end and middle‐end CPU–GPU combinations including a mobile GPU. We achieve speedups of up to 5.2× over the SIMD version of libjpeg‐turbo, and speedups of up to 10.5× over its sequential code. Taking into account the non‐parallelizable JPEG entropy‐decoding part, our approach achieves up to 97% of the theoretically attainable maximal speedup, with an average of 94%. Copyright © 2015 John Wiley & Sons, Ltd.
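The two ideas in the abstract that lend themselves to a small illustration are the speedup bound imposed by the sequential entropy-decoding stage and the splitting of the parallel stages between CPU and GPU. The following is a minimal sketch under simplifying assumptions, not the paper's performance model: the function names, the proportional-split rule, and the throughput figures are all hypothetical, and image entropy is ignored.

```python
def max_speedup(serial_fraction: float) -> float:
    """Amdahl-style upper bound on speedup.

    serial_fraction is the share of sequential decode time spent in the
    stage that cannot be parallelized (here: entropy decoding). Even if
    all other stages took zero time, the speedup cannot exceed
    1 / serial_fraction.
    """
    if not 0.0 < serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in (0, 1]")
    return 1.0 / serial_fraction


def split_rows(total_rows: int, cpu_rate: float, gpu_rate: float,
               block: int = 8) -> tuple[int, int]:
    """Split image rows between CPU and GPU in proportion to throughput.

    cpu_rate and gpu_rate are hypothetical profiled throughputs (e.g.
    rows per millisecond) for the parallel stages. The GPU share is
    rounded down to a multiple of `block` rows (8 is the height of a
    JPEG DCT block); the CPU takes the remainder.
    Returns (cpu_rows, gpu_rows).
    """
    if cpu_rate <= 0 or gpu_rate <= 0:
        raise ValueError("rates must be positive")
    gpu_share = gpu_rate / (cpu_rate + gpu_rate)
    gpu_rows = int(total_rows * gpu_share) // block * block
    return total_rows - gpu_rows, gpu_rows


if __name__ == "__main__":
    # Hypothetical numbers: 25% of sequential time is entropy decoding,
    # and the GPU is three times as fast as the CPU on the parallel stages.
    print(max_speedup(0.25))
    print(split_rows(1000, cpu_rate=1.0, gpu_rate=3.0))
```

A proportional split like this is only the simplest possible balancing rule; the abstract states that the actual model also accounts for image entropy and image dimensions.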
More From: Concurrency and Computation: Practice and Experience