GPU Acceleration of the Most Apparent Distortion Image Quality Assessment Algorithm

J A Holloway, Sohum Sohoni, Yi Zhang, Vignesh Kannan, Damon M Chandler

doi:10.3390/jimaging4100111

Abstract

The primary function of multimedia systems is to seamlessly transform and display content to users while maintaining the perception of acceptable quality. For images and videos, perceptual quality assessment algorithms play an important role in determining what is acceptable quality and what is unacceptable from a human visual perspective. As modern image quality assessment (IQA) algorithms gain widespread adoption, it is important to achieve a balance between their computational efficiency and their quality prediction accuracy. One way to improve computational performance to meet real-time constraints is to use simplistic models of visual perception, but such an approach has a serious drawback in terms of poor-quality predictions and limited robustness to changing distortions and viewing conditions. In this paper, we investigate the advantages and potential bottlenecks of implementing a best-in-class IQA algorithm, Most Apparent Distortion, on graphics processing units (GPUs). Our results suggest that an understanding of the GPU and CPU architectures, combined with detailed knowledge of the IQA algorithm, can lead to non-trivial speedups without compromising prediction accuracy. A single-GPU and a multi-GPU implementation showed a 24× and a 33× speedup, respectively, over the baseline CPU implementation. A bottleneck analysis revealed the kernels with the highest runtimes, and a microarchitectural analysis illustrated the underlying reasons for the high runtimes of these kernels. Programs written with optimizations such as blocking that map well to CPU memory hierarchies do not map well to the GPU’s memory hierarchy. While compute unified device architecture (CUDA) is convenient to use and is powerful in facilitating general purpose GPU (GPGPU) programming, knowledge of how a program interacts with the underlying hardware is essential for understanding performance bottlenecks and resolving them.

Highlights

Images and videos undergo several transformations from capture to display in various formats. The key to transferring this content over networks lies in the design of image/video analysis and processing algorithms that can simultaneously tackle two opposing goals: (1) handling potentially massive content sizes (e.g., 8K video), while (2) producing results in a timely fashion on practical computing hardware.
In this paper, we address graphics processing unit (GPU) acceleration of the perceptual and statistical processing stages used in quality assessment (QA), in the hope of informing the design, implementation, and deployment of future multimedia QA systems.
Obtaining performance gains through general purpose GPU (GPGPU) solutions is an attractive area of research.

Summary

Introduction

Images and videos undergo several transformations from capture to display in various formats. The key novelties compared to our prior work are as follows: (1) we implemented all of MAD in CUDA, thereby allowing analyses of all of the stages (detection, appearance, memory transfers); (2) we tested three different GPUs to examine the effects of different GPU architectures; (3) we further analyzed the differences between using a single GPU vs parallelizing MAD across three GPUs (multi-GPU implementation) to investigate how the results scale with the number of GPUs; and (4) we performed a microarchitectural analysis of the key bottleneck kernel (the CUDA kernel which computes the various local statistics) to gain insight into and inform future implementations about the GPU memory infrastructure usage.
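The local-statistics computation named in item (4) can be illustrated with a small CPU reference sketch. This is not the authors' CUDA kernel: the choice of statistics (standard deviation, skewness, kurtosis), the function names `block_stats` and `local_stats`, and the default block size and stride are all assumptions made for illustration, since the text above only says "various local statistics".

```python
import math


def block_stats(pixels):
    """Return (std, skewness, kurtosis) for a flat list of pixel values.

    Hypothetical helper: the statistics chosen here are an assumption.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    # Population central moments of order 2, 3 and 4 (divide by n).
    m2 = sum((v - mean) ** 2 for v in pixels) / n
    m3 = sum((v - mean) ** 3 for v in pixels) / n
    m4 = sum((v - mean) ** 4 for v in pixels) / n
    if m2 == 0.0:
        # Flat block: skewness and kurtosis are undefined, report zeros.
        return 0.0, 0.0, 0.0
    return math.sqrt(m2), m3 / m2 ** 1.5, m4 / m2 ** 2


def local_stats(image, block=16, stride=4):
    """Slide a block x block window over a 2-D list of pixel values.

    The defaults for block and stride are illustrative assumptions.
    Each window is processed independently of every other window.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows - block + 1, stride):
        row_out = []
        for c in range(0, cols - block + 1, stride):
            window = [image[r + i][c + j]
                      for i in range(block) for j in range(block)]
            row_out.append(block_stats(window))
        out.append(row_out)
    return out
```

Because each window's result depends only on its own pixels, the two outer loops could in principle be distributed across GPU threads. When the stride is smaller than the block size, neighbouring windows re-read many of the same pixels, which is one plausible reason a kernel of this shape would be sensitive to how the memory hierarchy is used.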

Related Work

Acceleration of QA Algorithms

GPU-based Acceleration on Other Image-Processing-Related Techniques

Description of the MAD Algorithm

Visual Detection Stage

Visual Appearance Stage

Overall MAD Score

CPU Tasks

Visual Detection Stage in CUDA

Visual Appearance Stage in CUDA

Results and Analysis

Evaluation 1

Per-Kernel Performance on the Detection Stage

Per-Kernel Performance on the Appearance Stage

Evaluation 2

2.31 Tflops

Evaluation 3

Evaluation 4

Memory Statistics—Global

Memory Statistics—Local

Memory Statistics—Caches

Conclusions

Journal: Journal of Imaging
Publication Date: Sep 25, 2018
Citations: 2
License type: CC BY 4.0
