Abstract
In this work, we present a convolutional neural network (CNN) named CGFA-CNN for blind image quality assessment (BIQA). A two-stage strategy is adopted: Sub-Network I first identifies the distortion type in an image, and Sub-Network II then quantifies that distortion. Unlike most deep neural networks, we extract hierarchical features as descriptors to enhance the image representation and design a feature aggregation layer, trained end-to-end, that applies Fisher encoding to visual vocabularies modeled by Gaussian mixture models (GMMs). To cover both authentic and synthetic distortions, the hierarchical features combine the characteristics of a CNN trained on our self-built dataset and a CNN trained on ImageNet. We evaluated our algorithm on four publicly available databases, and the results demonstrate that CGFA-CNN outperforms other methods on both synthetic and authentic databases.
Highlights
Digital pictures may suffer various distortions during acquisition, transmission, and compression, leading to unsatisfactory perceived visual quality or a certain level of annoyance. It is therefore crucial to predict the quality of digital pictures in many applications, such as compression, communication, printing, display, analysis, registration, restoration, and enhancement [1,2,3]
We propose an end-to-end blind image quality assessment (BIQA) method based on classification guidance and feature aggregation, accomplished by two sub-networks that share features in the early layers
This allows the proposed CGFA-convolutional neural network (CNN) to accept an image of any size as the input, so there is no need to perform any transformation of the image that would affect perceptual quality scores
Summary
Digital pictures may suffer various distortions during acquisition, transmission, and compression, leading to unsatisfactory perceived visual quality or a certain level of annoyance. General BIQA methods aim to work well for arbitrary distortions and can be classified into two categories according to the features extracted: natural scene statistics (NSS)-based methods and training-based methods. We propose an end-to-end BIQA method based on classification guidance and feature aggregation, accomplished by two sub-networks that share features in the early layers. A fully connected layer is exploited as a linear regression model to map the high-dimensional features into quality scores. This allows the proposed CGFA-CNN to accept an image of any size as the input, so there is no need to perform any transformation of the image (including cropping, scaling, etc.) that would affect perceptual quality scores.
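The feature-aggregation step described above, Fisher encoding of local descriptors against a GMM visual vocabulary, can be sketched in plain Python. This is an illustrative sketch only: the GMM parameters, descriptor dimensions, and function name are placeholders, not the paper's trained values or actual layer implementation.

```python
import math

def fisher_vector(descriptors, weights, means, sigmas):
    """Fisher-vector encoding of local descriptors under a diagonal GMM.

    descriptors: N local D-dim feature vectors (lists of floats).
    weights, means, sigmas: K mixture weights, K mean vectors, and
    K standard-deviation vectors of the GMM vocabulary.
    Returns a 2*K*D-dim vector: gradients of the GMM log-likelihood
    with respect to the means and standard deviations.
    """
    K, D, N = len(weights), len(means[0]), len(descriptors)

    # Soft-assignment posteriors gamma[i][k] via a log-sum-exp for stability.
    gammas = []
    for x in descriptors:
        logp = []
        for k in range(K):
            s = math.log(weights[k])
            for d in range(D):
                s += (-0.5 * math.log(2 * math.pi * sigmas[k][d] ** 2)
                      - 0.5 * ((x[d] - means[k][d]) / sigmas[k][d]) ** 2)
            logp.append(s)
        m = max(logp)
        norm = sum(math.exp(v - m) for v in logp)
        gammas.append([math.exp(v - m) / norm for v in logp])

    fv = []
    # Gradients with respect to the component means.
    for k in range(K):
        for d in range(D):
            g = sum(gammas[i][k] * (descriptors[i][d] - means[k][d]) / sigmas[k][d]
                    for i in range(N))
            fv.append(g / (N * math.sqrt(weights[k])))
    # Gradients with respect to the component standard deviations.
    for k in range(K):
        for d in range(D):
            g = sum(gammas[i][k]
                    * (((descriptors[i][d] - means[k][d]) / sigmas[k][d]) ** 2 - 1)
                    for i in range(N))
            fv.append(g / (N * math.sqrt(2 * weights[k])))
    return fv
```

In CGFA-CNN this encoding is realized as a differentiable layer so the GMM vocabulary is learned end-to-end with the network; the resulting 2*K*D-dimensional vector is what the fully connected layer regresses to a quality score.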