Abstract

In this work, we present a convolutional neural network (CNN) named CGFA-CNN for blind image quality assessment (BIQA). A two-stage strategy is employed: Sub-Network I first identifies the distortion type in an image, and Sub-Network II then quantifies that distortion. Unlike most deep neural networks, we extract hierarchical features as descriptors to enrich the image representation, and we design a feature aggregation layer, trained end to end, that applies Fisher encoding to visual vocabularies modeled by Gaussian mixture models (GMMs). To cover both authentic and synthetic distortions, the hierarchical features combine the characteristics of a CNN trained on our self-built dataset and a CNN trained on ImageNet. We evaluated our algorithm on four publicly available databases, and the results demonstrate that CGFA-CNN outperforms competing methods on both synthetic and authentic databases.
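The Fisher encoding mentioned above aggregates a variable number of local descriptors into one fixed-length vector by taking gradients of the GMM log-likelihood with respect to the mixture means and variances. The following is a minimal NumPy/scikit-learn sketch of standard (improved) Fisher vector encoding, not the paper's exact layer; all array shapes and the toy descriptors are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(descriptors, gmm):
    """Encode local descriptors (T, D) as an improved Fisher vector of
    length 2*K*D, using gradients w.r.t. GMM means and variances."""
    T = descriptors.shape[0]
    gamma = gmm.predict_proba(descriptors)              # (T, K) posteriors
    mu, var, w = gmm.means_, gmm.covariances_, gmm.weights_
    sigma = np.sqrt(var)                                # (K, D), diagonal model
    # Normalized deviations of every descriptor from every component: (T, K, D)
    diff = (descriptors[:, None, :] - mu[None]) / sigma[None]
    g_mu = (gamma[..., None] * diff).sum(0) / (T * np.sqrt(w)[:, None])
    g_sigma = (gamma[..., None] * (diff**2 - 1)).sum(0) / (T * np.sqrt(2 * w)[:, None])
    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))              # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)            # L2 normalization

# Toy example: fit a 4-component visual vocabulary on random 8-D descriptors.
rng = np.random.default_rng(0)
gmm = GaussianMixture(n_components=4, covariance_type="diag",
                      random_state=0).fit(rng.normal(size=(500, 8)))
fv = fisher_vector(rng.normal(size=(100, 8)), gmm)      # shape (2*4*8,) = (64,)
```

Because the output length depends only on K and D, the same encoding works however many descriptors an image yields.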

Highlights

  • Digital pictures may suffer from various distortions during acquisition, transmission, and compression, leading to unsatisfactory perceived visual quality or a certain level of annoyance. It is therefore crucial to predict the quality of digital pictures in many applications, such as compression, communication, printing, display, analysis, registration, restoration, and enhancement [1,2,3]

  • We propose an end-to-end blind image quality assessment (BIQA) method based on classification guidance and feature aggregation, accomplished by two sub-networks that share features in the early layers

  • This allows the proposed CGFA-convolutional neural network (CNN) to accept an image of any size as input, so there is no need to perform any transformation of the image that would affect its perceptual quality score


Summary

Introduction

Digital pictures may suffer from various distortions during acquisition, transmission, and compression, leading to unsatisfactory perceived visual quality or a certain level of annoyance. General BIQA methods aim to work well for arbitrary distortions and can be classified into two categories according to the features they extract, i.e., natural scene statistics (NSS)-based methods and training-based methods. We propose an end-to-end BIQA method based on classification guidance and feature aggregation, accomplished by two sub-networks that share features in the early layers. A fully connected layer serves as a linear regression model that maps the high-dimensional features to quality scores. This allows the proposed CGFA-CNN to accept an image of any size as input, so there is no need to perform any transformation of the image (including cropping, scaling, etc.) that would affect its perceptual quality score.
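The reason a fixed linear regression layer can follow a convolutional backbone regardless of input size is that the aggregation step collapses the spatial dimensions before regression. A minimal NumPy sketch of that idea, using plain global average pooling in place of the paper's Fisher-based aggregation; the channel count, weights, and bias here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(scale=0.01, size=(256,))   # hypothetical regression weights (C,)
b = 50.0                                  # hypothetical bias, mid-range score

def predict_quality(feature_map):
    """Map a conv feature map of ANY spatial size to one scalar score:
    global average pooling removes (H, W), then a linear layer regresses."""
    pooled = feature_map.mean(axis=(1, 2))    # (C, H, W) -> (C,)
    return float(pooled @ W + b)

# Feature maps with different spatial sizes go through the same head,
# so the input image never needs cropping or scaling.
s1 = predict_quality(rng.normal(size=(256, 14, 14)))
s2 = predict_quality(rng.normal(size=(256, 9, 21)))
```

The design choice is that only the channel dimension is tied to the regression weights; spatial extent is free to vary with the input image.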

Related Work
The Proposed Method
Construction of the Pre-Training Dataset
Sub-Network I Architecture
Feature Extraction and Fusion
Feature Aggregation Layer and Encoding
GMM Clustering
Fisher Encoding
Beyond the FV Aggregation
Classification-Guided Gating Unit and Quality Prediction
Experimental Results and Discussions
Experimental Settings
Consistency Experiment
Method
Cross-Database Experiment
Comparison among Different Experimental Settings
Conclusions

