In computer vision, both hand-crafted and learning-based methods have yielded remarkable results in image quality assessment. However, accurately perceiving and scoring image quality without a reference remains challenging. To address the difficulties of Image Quality Assessment (IQA) on authentically distorted images, we use the Swin Transformer (ST) to extract features. To enable the model to attend to both the spatial and channel information of these features, we design a plug-and-play Global Self-Attention Block (GSAB). We also introduce a Transformer block to strengthen the model's ability to capture long-range dependencies. Finally, a Dual-Branching structure produces the predicted image quality score. We evaluate our method on four synthetic datasets and two authentic datasets, weighting the results by dataset size; our method outperforms existing methods and also performs well in cross-dataset generalization tests, demonstrating good generalization ability. The code will be released at https://github.com/yanggege-new/NR-IQA-based-on-global-awareness.
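The abstract does not specify the internals of the GSAB. As a hedged illustration only, the sketch below shows one common way a block can re-weight a feature map along both its channel and spatial dimensions (pooling followed by a sigmoid gate, in the spirit of CBAM-style attention). The function name, the pooling choices, and the gating are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def global_attention_sketch(feat):
    """Hypothetical spatial-and-channel re-weighting of a (C, H, W) feature map.

    Illustrative sketch, NOT the paper's GSAB: each channel is weighted by a
    gate computed from global average pooling (channel attention), then each
    spatial location is weighted by a gate computed from cross-channel
    pooling (spatial attention).
    """
    # Channel attention: one gate per channel from global average pooling.
    chan_gate = sigmoid(feat.mean(axis=(1, 2)))     # shape (C,)
    feat = feat * chan_gate[:, None, None]
    # Spatial attention: one gate per location from cross-channel pooling.
    spat_gate = sigmoid(feat.mean(axis=0))          # shape (H, W)
    return feat * spat_gate[None, :, :]

# Toy usage: an 8-channel, 4x4 feature map keeps its shape after re-weighting.
x = np.random.rand(8, 4, 4).astype(np.float32)
y = global_attention_sketch(x)
print(y.shape)  # (8, 4, 4)
```

In a real model the gates would be produced by small learnable layers (e.g. an MLP or convolution) rather than parameter-free pooling; this sketch only shows the data flow of attending to both dimensions.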