Abstract

An established way of validating and testing new image quality assessment (IQA) algorithms has been to compare how well they correlate with subjective data on various image databases. The most common measures are the linear correlation coefficient (LCC) and Spearman’s rank-order correlation coefficient (SROCC), both calculated against the subjective mean opinion score (MOS). Recently, databases with multiply distorted images have emerged <sup>1,2</sup>. However, with multidimensional stimuli there is more disagreement between observers, because the task is more a matter of preference than of distortion detection. This reduces the number of statistically significant differences between image pairs. If the subjects cannot distinguish a difference between some of the image pairs, should we demand any better performance from IQA algorithms? This paper proposes two alternative performance measures for evaluating IQA algorithms on the CID2013 database. The first is the root-mean-square error (RMSE) of the subjective data as a function of the number of observers. The second is the number of statistically significant differences between image pairs. This study shows that after 12 subjects the RMSE saturates at around three, meaning that the target RMSE for an IQA algorithm on the CID2013 database should be three. In addition, this study shows that state-of-the-art IQA algorithms identified the better image of a pair with a probability of 0.85 when only image pairs with statistically significant differences were taken into account.
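The two proposed measures can be illustrated with a minimal sketch. The abstract does not specify the exact computation, so everything below is an assumption made for illustration: the RMSE is taken between the MOS of a random observer subset and the MOS of the full panel, significance is decided with a Welch-type statistic under a normal approximation (threshold `z_crit`), and the function names `subset_rmse`, `significant_pairs`, and `pairwise_accuracy` are hypothetical.

```python
import numpy as np


def subset_rmse(ratings, n_observers, n_trials=200, seed=0):
    """Mean RMSE between subset MOS and full-panel MOS (assumed definition).

    ratings: array of shape (observers, images) holding individual scores.
    """
    ratings = np.asarray(ratings, dtype=float)
    rng = np.random.default_rng(seed)
    full_mos = ratings.mean(axis=0)
    errors = []
    for _ in range(n_trials):
        # Draw n_observers distinct observers and recompute the MOS.
        idx = rng.choice(ratings.shape[0], size=n_observers, replace=False)
        subset_mos = ratings[idx].mean(axis=0)
        errors.append(np.sqrt(np.mean((subset_mos - full_mos) ** 2)))
    return float(np.mean(errors))


def significant_pairs(ratings, z_crit=1.96):
    """Return (i, j) pairs where image i is rated significantly above image j.

    Assumption: Welch-type statistic compared against a normal threshold.
    """
    ratings = np.asarray(ratings, dtype=float)
    n_obs, n_img = ratings.shape
    mos = ratings.mean(axis=0)
    var = ratings.var(axis=0, ddof=1)
    pairs = []
    for i in range(n_img):
        for j in range(n_img):
            if i == j:
                continue
            se = np.sqrt((var[i] + var[j]) / n_obs)
            if se > 0 and (mos[i] - mos[j]) / se > z_crit:
                pairs.append((i, j))
    return pairs


def pairwise_accuracy(pairs, predicted):
    """Fraction of significant pairs in which the IQA score ranks i above j."""
    if not pairs:
        return float("nan")
    hits = sum(predicted[i] > predicted[j] for i, j in pairs)
    return hits / len(pairs)
```

Evaluating `subset_rmse` for increasing `n_observers` gives a curve whose flattening point plays the role of the saturation level described above, and `pairwise_accuracy` corresponds to the probability of picking the better image when only significantly different pairs are counted.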
