No-Reference Image Quality Assessment (NR-IQA) aims to predict the perceptual quality of an image in agreement with its subjective evaluation. However, the vulnerability of NR-IQA models to adversarial attacks has not been thoroughly studied as a basis for model refinement. This paper investigates the potential loopholes of NR-IQA models via black-box adversarial attacks. Specifically, we first formulate the attack problem as maximizing the deviation between the estimated quality scores of the original and perturbed images, while restricting the distortion of the perturbed image to preserve its visual quality. Under this formulation, we then design a Bi-directional loss function that misleads the estimated quality scores of adversarial examples in the opposite direction with maximum deviation. On this basis, we finally develop an efficient and effective black-box attack method for NR-IQA models based on a random search paradigm. Comprehensive experiments on three benchmark datasets show that all evaluated NR-IQA models are significantly vulnerable to the proposed attack method. After being attacked, the average change rates of the victim models in terms of two well-known IQA performance metrics reach 97% and 101%, respectively. In addition, our attack method outperforms a recently introduced black-box attack approach on IQA models. We also observe that the generated perturbations are not transferable, which points to a new research direction for the NR-IQA community. The source code is available at https://github.com/GZHU-DVL/AttackIQA.
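As an illustration only, the attack formulation described above can be sketched in a few lines of Python. This is not the released AttackIQA code. The names `toy_quality_model`, `bidirectional_loss`, and `random_search_attack` are hypothetical, and the L-infinity budget `eps`, the square-patch proposals, the [0, 100] score range, and the gradient-based stand-in model are all assumptions made for the sake of a runnable example.

```python
import numpy as np


def toy_quality_model(image):
    """Hypothetical stand-in for a black-box NR-IQA model (assumption).

    Returns a score in [0, 100] for an image with values in [0, 1],
    using the mean absolute horizontal gradient as a crude proxy.
    """
    return float(100.0 * np.abs(np.diff(image, axis=1)).mean())


def bidirectional_loss(score, original_score, low=0.0, high=100.0):
    """Distance of `score` from the end of the range opposite the original.

    High-scoring images are pushed towards `low`, low-scoring images
    towards `high`; a smaller value is better for the attacker.
    """
    midpoint = 0.5 * (low + high)
    target = low if original_score >= midpoint else high
    return abs(score - target)


def random_search_attack(model, image, eps=8 / 255, patch=4, steps=500, seed=0):
    """Query-only random search under an assumed L-infinity budget `eps`."""
    rng = np.random.default_rng(seed)
    height, width = image.shape
    original_score = model(image)

    delta = np.zeros_like(image)  # start from the clean image
    adv = image.copy()
    best_loss = bidirectional_loss(original_score, original_score)

    for _ in range(steps):
        # Propose a square patch set to +eps or -eps at a random location.
        top = rng.integers(0, height - patch + 1)
        left = rng.integers(0, width - patch + 1)
        candidate_delta = delta.copy()
        candidate_delta[top:top + patch, left:left + patch] = eps * rng.choice([-1.0, 1.0])
        candidate = np.clip(image + candidate_delta, 0.0, 1.0)

        # Keep the proposal only if the model's score moves closer to the target.
        loss = bidirectional_loss(model(candidate), original_score)
        if loss < best_loss:
            delta, adv, best_loss = candidate_delta, candidate, loss

    return adv, original_score, model(adv)


# Usage on a random grayscale image with values in [0, 1].
image = np.random.default_rng(1).random((16, 16))
adv, clean_score, adv_score = random_search_attack(toy_quality_model, image)
print(f"clean score: {clean_score:.2f}, adversarial score: {adv_score:.2f}")
```

The sketch only queries the model for scores, which is what makes it black-box, and it accepts a proposal only when the loss decreases, so the adversarial score can never end up further from the target than the clean score.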