Abstract

Image quality assessment (IQA) algorithms aim to predict image quality as perceived by human observers. Over the last two decades, a large amount of work has been carried out in the field. New algorithms are being developed at a rapid rate in different areas of IQA, but they are often tested against only a limited set of existing models using out-of-date test data. There is a significant gap when it comes to large-scale performance evaluation studies that include a wide variety of test data and competing algorithms. In this work we aim to fill this gap by carrying out the largest performance evaluation study so far. We test the performance of 43 full-reference (FR), seven fused FR (22 versions), and 14 no-reference (NR) methods on nine subject-rated IQA datasets, of which five contain singly distorted images and four contain multiply distorted content. We use a variety of performance evaluation and statistical significance testing criteria. Our findings not only point to the top performing FR and NR IQA methods, but also highlight the performance gap between them. We have also conducted a comparative study of FR fusion methods; an important finding is that rank-aggregation-based FR fusion outperforms not only other FR fusion approaches but also the top performing FR methods. It may be used to annotate IQA datasets as a possible alternative to subjective ratings, especially where it is not possible to obtain human opinions, such as for large-scale datasets composed of thousands or even millions of images.

Highlights

  • Image quality assessment (IQA) can be broadly categorized into subjective and objective quality assessment (QA)

  • In this work we evaluate the performance of 43 FR IQA methods which are listed in Table 3 along with information about whether a method operates on color or grayscale images, year of publication, and the number and names of the IQA databases that it was tested on

  • In this work we evaluate the performance of seven FR fusion based methods which are listed in Table 4 along with information about whether they operate on grayscale or color images, year of publication, and number and names of the IQA databases that they were tested on


Summary

INTRODUCTION

Image quality assessment (IQA) can be broadly categorized into subjective and objective quality assessment (QA). A comprehensive review of basic computational building blocks used in the design of perceptual IQA metrics is given in [24], along with a description of six FR IQA methods. The performance of these methods is evaluated on seven IQA databases (A57 [7], CSIQ [6], IVC [10], LIVE R2 [3], MICT [9], TID2008 [4], and WIQ [8]) in terms of PLCC and SRCC. Reference [34] uses both the TID2008 [4] and TID2013 [5] datasets, where the latter contains all the reference and distorted images of the former. Given these shortcomings, it is evident that existing surveys are unable to identify the top performing FR, fused FR, and NR methods in a competitive and comparative setting.
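The two criteria named above, PLCC (Pearson linear correlation coefficient) and SRCC (Spearman rank-order correlation coefficient), compare a method's predicted scores with subjective ratings. The following is a minimal sketch using only the Python standard library; it is not the evaluation code of this study. The function names and the sample scores are hypothetical, and the nonlinear mapping step that IQA studies commonly apply before computing PLCC is omitted here.

```python
from math import sqrt
from statistics import mean


def plcc(x, y):
    """Pearson linear correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def srcc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(average_ranks(x), average_ranks(y))


# Hypothetical subjective scores and objective predictions (illustration only).
subjective = [1.2, 2.5, 3.1, 4.0, 4.8]
predicted = [0.10, 0.30, 0.35, 0.70, 0.95]

print(f"PLCC = {plcc(subjective, predicted):.4f}")
print(f"SRCC = {srcc(subjective, predicted):.4f}")
```

In this toy data the predictions preserve the ordering of the subjective scores, so SRCC reaches its maximum, while PLCC falls below 1 because the relationship is monotonic but not exactly linear. That difference is why the two criteria are usually reported together.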

REVIEW OF IQA DATABASES
FULL-REFERENCE IMAGE QUALITY ASSESSMENT
FR FUSION BASED IMAGE QUALITY ASSESSMENT
Findings
PERFORMANCE ANALYSIS OF NR METHODS