Attention Down-Sampling Transformer, Relative Ranking and Self-Consistency For Blind Image Quality Assessment
No-reference image quality assessment is a challenging problem: estimating image quality without the original reference. We introduce an improved mechanism to extract local and non-local information from images via different transformer encoders and CNNs. The transformer encoders mitigate locality bias and generate a non-local representation by sequentially processing CNN features, which inherently capture local visual structures. A stronger connection between subjective and objective assessments is established by sorting images within batches based on relative distance information. We also present a self-consistency approach to self-supervision that explicitly addresses the degradation of no-reference image quality assessment (NR-IQA) models under equivariant transformations: the model is kept robust by enforcing consistency between an image and its horizontally flipped equivalent. In empirical evaluation on five popular image quality assessment datasets, the proposed model outperforms alternative no-reference algorithms, especially on smaller datasets. Code is available at https://github.com/mas94/ADTRS
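The self-consistency constraint the abstract describes can be made concrete with a small sketch: an NR-IQA model's score for an image and for its horizontally flipped copy should agree, and their gap can serve as a consistency loss. `toy_score` below is a hypothetical stand-in for a learned quality predictor, not the paper's model.

```python
def hflip(img):
    """Horizontally flip an image given as a list of rows."""
    return [list(reversed(row)) for row in img]

def toy_score(img):
    """Hypothetical quality scorer: column-weighted mean intensity.
    Deliberately not flip-invariant, so the consistency loss is informative."""
    h, w = len(img), len(img[0])
    total = sum(img[y][x] * (x + 1) for y in range(h) for x in range(w))
    return total / (h * w * (w + 1) / 2)

def consistency_loss(score_fn, img):
    """Absolute score gap between an image and its horizontal flip."""
    return abs(score_fn(img) - score_fn(hflip(img)))

symmetric = [[1, 2, 1], [3, 0, 3]]    # left-right symmetric rows
asymmetric = [[9, 0, 0], [9, 0, 0]]   # intensity mass on the left
print(consistency_loss(toy_score, symmetric))    # 0.0
print(consistency_loss(toy_score, asymmetric))   # 3.0
```

In training, such a loss would be added to the quality-regression objective so the network cannot drift under this equivariant transformation.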
- Research Article
4
- 10.3390/sym13081446
- Aug 6, 2021
- Symmetry
The multi-exposure fusion (MEF) technique offers humans a new way to represent natural scenes, and the related quality assessment issues urgently need to be addressed to validate the effectiveness of these techniques. In this paper, a curvature and entropy statistics-based blind MEF image quality assessment (CE-BMIQA) method is proposed to perceive quality degradation objectively. The transformation from multiple images with different exposure levels to the final MEF image loses structure and detail information, so curvature statistics features and entropy statistics features are used to characterize this distortion. The former are extracted from histogram statistics of a surface type map computed from the mean curvature and Gaussian curvature of the MEF image, with contrast energy weighting attached to account for the contrast variation of the MEF image. The latter comprise spatial entropy and spectral entropy. All features, extracted under a multi-scale scheme, are aggregated by training a quality regression model via random forest. Since the MEF image and its feature representation are spatially symmetric in physics, the final prediction quality is symmetric to and representative of the image distortion. Experimental results on a public MEF image database demonstrate that the proposed CE-BMIQA method outperforms state-of-the-art blind image quality assessment methods.
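Of the features listed, the spatial entropy component is easy to illustrate: it is the Shannon entropy of the image's gray-level histogram. A minimal sketch, with illustrative variable names not taken from the paper:

```python
import math

def spatial_entropy(img, levels=256):
    """Shannon entropy (bits) of the gray-level histogram of `img`,
    a list of rows of integer intensities in [0, levels)."""
    hist = [0] * levels
    n = 0
    for row in img:
        for v in row:
            hist[v] += 1
            n += 1
    ent = 0.0
    for c in hist:
        if c:
            p = c / n
            ent -= p * math.log2(p)  # -sum p*log2(p) over occupied bins
    return ent

flat = [[128] * 8 for _ in range(8)]                               # one gray level
varied = [[(3 * x + 7 * y) % 256 for x in range(8)] for y in range(8)]
print(spatial_entropy(flat))     # 0.0 bits
print(spatial_entropy(varied))   # several bits
```

The spectral-entropy counterpart would apply the same formula to a normalized spectrum (e.g., DCT coefficient energies) rather than to pixel intensities.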
- Conference Article
1
- 10.1145/3451421.3451464
- Dec 5, 2020
Image quality assessment is widely used in many image processing tasks; it can help researchers adjust image processing algorithms, design imaging systems, and evaluate image processing systems. Generally, CT image quality assessment can be categorized into task-specific and general image quality evaluation. Task-specific image quality assessment evaluates the performance of the imaging system or the detectability of a tumor; examples of such IQA indices are the modulation transfer function (MTF), signal-to-noise ratio (SNR), and observer models. General image quality assessment measures overall reconstruction quality under different reconstruction algorithms; traditional indices such as SSIM (structural similarity) and mean squared error (MSE) are widely used in CT image quality assessment today. The drawback of these indices is their demand for reference images, which is not practical in a real CT system. In this paper, we design a CT image dataset and, using it, propose a blind image quality assessment (BIQA) model based on CT image statistics, which can be employed to measure algorithms when no reference image is available. Unlike other image datasets, ours includes non-converged images from the reconstruction process, which enables our BIQA model to evaluate non-converged images during the iterations. Hence, the BIQA model can be embedded in the reconstruction process to monitor reconstructed image quality during iterations.
- Research Article
42
- 10.1109/tmm.2019.2938612
- Sep 5, 2019
- IEEE Transactions on Multimedia
Blind image quality assessment (BIQA) aims to develop quantitative measures to automatically and accurately estimate the visual quality of an image without any prior information about its reference image. This issue has been attracting a great deal of attention for a long time; however, little work has been done on night-time images, which are crucially important for consumer photography and practical applications such as automated driving systems. In this paper, to the best of our knowledge, we conduct the first exploration on subjective and objective quality assessment of night-time images. First, we build a large-scale natural night-time image database (NNID) containing 2240 images with 448 different image contents captured by different photographic equipment in real-world scenarios. Subsequently, we carry out a subjective experiment to evaluate the perceptual quality of all the images in the NNID database. Thereafter, we perform objective assessment of night-time images by proposing a blind night-time image quality assessment metric using brightness and texture features (BNBT). Finally, extensive experiments are conducted to evaluate the performance and efficiency of the proposed BNBT metric on the NNID database. The experimental results demonstrate that this metric outperforms existing state-of-the-art BIQA methods in terms of all evaluation criteria and has an acceptable computational cost at the same time. We have made the NNID database publicly available for downloading at https://sites.google.com/site/xiangtaooo/ .
- Research Article
7
- 10.1109/tmi.2024.3418652
- Oct 1, 2024
- IEEE transactions on medical imaging
Lowering radiation dose per view and utilizing sparse views per scan are two common CT scan modes, albeit often leading to distorted images characterized by noise and streak artifacts. Blind image quality assessment (BIQA) strives to evaluate perceptual quality in alignment with what radiologists perceive, which plays an important role in advancing low-dose CT reconstruction techniques. An intriguing direction involves developing BIQA methods that mimic the operational characteristics of the human visual system (HVS). The internal generative mechanism (IGM) theory reveals that the HVS actively deduces primary content to enhance comprehension. In this study, we introduce an innovative BIQA metric that emulates the active inference process of the IGM. Initially, an active inference module, implemented as a denoising diffusion probabilistic model (DDPM), is constructed to anticipate the primary content. Then, the dissimilarity map is derived by assessing the interrelation between the distorted image and its primary content. Subsequently, the distorted image and dissimilarity map are combined into a multi-channel image, which is input to a transformer-based image quality evaluator. By leveraging the DDPM-derived primary content, our approach achieves competitive performance on a low-dose CT dataset.
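The middle steps of this pipeline can be sketched without the DDPM itself: given the distorted image and a predicted "primary content" (mocked below as a plain array), derive a per-pixel dissimilarity map and stack both into a multi-channel input for the evaluator. All names are illustrative; the paper's actual dissimilarity measure may be more elaborate than an absolute difference.

```python
def dissimilarity_map(distorted, primary):
    """Per-pixel absolute difference between the distorted image and
    its (mocked) DDPM-predicted primary content."""
    return [[abs(d - p) for d, p in zip(drow, prow)]
            for drow, prow in zip(distorted, primary)]

def stack_channels(distorted, dissim):
    """Combine image and dissimilarity map into a 2-channel (channel, row,
    col) nested list, ready for a transformer-based evaluator."""
    return [distorted, dissim]

distorted = [[10, 12], [14, 20]]
primary   = [[10, 10], [15, 15]]   # stand-in for the DDPM output
dmap = dissimilarity_map(distorted, primary)
print(dmap)                        # [[0, 2], [1, 5]]
print(len(stack_channels(distorted, dmap)))   # 2 channels
```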
- Research Article
- 10.1145/3720547
- May 22, 2025
- ACM Transactions on Multimedia Computing, Communications, and Applications
Blind panoramic image quality assessment (BPIQA) has recently brought a new challenge to the visual quality community, due to the complex interaction between immersive content and human behavior. Although many efforts have been made to advance BPIQA, both by conducting psychophysical experiments and by designing performance-driven objective algorithms, the limited content and few samples in those closed sets inevitably lead to shaky conclusions, thereby hindering the development of BPIQA; we refer to this as the easy-database issue. In this article, we present a thorough computational analysis of degradation modeling in BPIQA to explore the easy-database issue, carefully designing three types of experiments that investigate the gap between BPIQA and blind image quality assessment (BIQA), the necessity of specific designs in BPIQA models, and the generalization ability of BPIQA models. From extensive experiments, we find that easy databases narrow the performance gap between BPIQA and BIQA models, which is unconducive to the development of BPIQA. Easy databases also push BPIQA models close to saturation, so the effectiveness of the associated specific designs cannot be well verified. Besides, BPIQA models trained on our recently proposed databases with complicated degradation show better generalization ability. We therefore believe that much more effort should be put into BPIQA, from both the subjective and the objective viewpoint.
- Research Article
23
- 10.3390/e20110885
- Nov 17, 2018
- Entropy
Blind/no-reference image quality assessment is performed to accurately evaluate the perceptual quality of a distorted image without prior information from a reference image. In this paper, an effective blind image quality assessment approach based on entropy differences in the discrete cosine transform domain for natural images is proposed. Information entropy is an effective measure of the amount of information in an image. We find that the discrete cosine transform coefficient distribution of distorted natural images shows a pulse-shape phenomenon, which directly affects the differences of entropy. A Weibull model is then used to fit the distributions of natural and distorted images, because the Weibull model sufficiently approximates the pulse-shape phenomenon as well as the sharp-peak and heavy-tail phenomena of natural scene statistics. Four features related to entropy differences and the human visual system are extracted from the Weibull model at three image scales. Image quality is assessed by support vector regression based on the extracted features. This blind Weibull statistics algorithm is thoroughly evaluated on three widely used databases: LIVE, TID2008, and CSIQ. The experimental results show that the proposed blind Weibull statistics method is highly consistent with human visual perception and, in most cases, outperforms state-of-the-art blind and full-reference image quality assessment methods.
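The Weibull model referred to here is the two-parameter density with shape k and scale λ, f(x) = (k/λ)(x/λ)^(k−1) e^(−(x/λ)^k) for x ≥ 0, fitted to DCT-coefficient magnitude distributions. A minimal sketch of the density, with illustrative parameter values and a crude numerical sanity check that it integrates to about 1:

```python
import math

def weibull_pdf(x, k, lam):
    """Two-parameter Weibull density:
    f(x) = (k/lam) * (x/lam)**(k-1) * exp(-(x/lam)**k), zero for x < 0."""
    if x < 0:
        return 0.0
    return (k / lam) * (x / lam) ** (k - 1) * math.exp(-((x / lam) ** k))

# Riemann-sum check that the density integrates to ~1 over [0, 50].
k, lam, dx = 1.5, 2.0, 0.001
area = sum(weibull_pdf(i * dx, k, lam) * dx for i in range(1, 50_000))
print(round(area, 3))   # close to 1.0
```

In the paper's setting, k and λ would be estimated from the observed coefficient distribution (e.g., by maximum likelihood), and the fitted parameters then feed the entropy-difference features.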
- Research Article
- 10.1016/j.jvcir.2024.104152
- Apr 16, 2024
- Journal of Visual Communication and Image Representation
Blind cartoon image quality assessment based on local structure and chromatic statistics
- Research Article
51
- 10.1109/access.2020.2972158
- Jan 1, 2020
- IEEE Access
In contrast with traditional images, an omnidirectional image (OI) has a higher resolution and provides the user with an interactive wide field of view. The equirectangular projection (ERP) format, the default for encoding and transmitting omnidirectional visual content, is not suitable for quality assessment of OIs because of serious geometric distortion in the bipolar regions, especially for blind image quality assessment. In this paper, a segmented spherical projection (SSP) based blind omnidirectional image quality assessment (SSP-BOIQA) method is proposed. The OI in ERP format is first converted to SSP format, which removes the stretching distortion in the bipolar regions of the ERP format while retaining its equatorial region. On the one hand, considering that the bipolar regions of the SSP format are circular, a local/global perceptual feature extraction scheme with a fan-shaped window is proposed for estimating the distortion in the bipolar regions of the OI. On the other hand, the perceptual features of the equatorial region are extracted with a heat map as a weighting factor to reflect users' visual behavior. Then, the features extracted from the OI's bipolar and equatorial regions are pooled to predict the quality of distorted OIs. Experiments on two databases, CVIQD2018 and MVAQD, demonstrate that the proposed SSP-BOIQA method outperforms state-of-the-art blind quality assessment methods and is more consistent with human visual perception.
- Book Chapter
- 10.1007/978-981-10-7629-9_21
- Jan 1, 2018
Blind image quality assessment metrics play an important role in the field of image processing. Blind methods that are specific to a given type of distortion are very popular in different image processing applications, and JPEG compression is one of the most common image compression methods. In this paper, a support vector regression (SVR) approach is adopted to assess the quality of JPEG compressed images without a reference image. First, a low-frequency feature in the DCT domain and a blockiness feature are calculated to represent the distortion information of an image. Second, the JPEG dataset is divided randomly into training and testing sets; the training set is used to build the SVR model, and the testing set is used to predict quality scores. Finally, combined with MOS or DMOS values, the quality score is predicted by the SVR model. Extensive experiments on the LIVE database demonstrate that the proposed method outperforms state-of-the-art methods in both prediction accuracy and computational complexity.
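A blockiness feature of the kind this abstract pairs with the DCT low-frequency feature can be sketched as the mean absolute intensity jump across 8-pixel block boundaries; the paper's exact definition may differ, so this is illustrative only:

```python
def blockiness(img, block=8):
    """Mean absolute difference across vertical block boundaries of `img`,
    a list of rows of intensities; JPEG uses 8x8 coding blocks."""
    h, w = len(img), len(img[0])
    diffs = [abs(img[y][x] - img[y][x - 1])
             for y in range(h)
             for x in range(block, w, block)]   # columns 8, 16, ...
    return sum(diffs) / len(diffs) if diffs else 0.0

smooth = [[50] * 16 for _ in range(4)]             # no boundary jumps
blocky = [[50] * 8 + [90] * 8 for _ in range(4)]   # hard edge at x = 8
print(blockiness(smooth))   # 0.0
print(blockiness(blocky))   # 40.0
```

A full feature would also scan horizontal block boundaries and could normalize by overall image activity so that genuine edges are not mistaken for compression artifacts.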
- Research Article
22
- 10.1016/j.ijleo.2020.164189
- Jan 9, 2020
- Optik
Blind image quality assessment using natural scene statistics of stationary wavelet transform
- Research Article
37
- 10.1109/access.2018.2890304
- Jan 1, 2019
- IEEE Access
We propose a blind image quality assessment model using classification and prediction for three-dimensional (3D) image quality assessment (denoted CAP-3DIQA) that can automatically evaluate the quality of stereoscopic images. First, in the classification stage, the model separates the distorted images into several subsets according to the types of image distortion, assigning images with the same distortion type to the same group. After classification, the distorted image set is fed into an image quality predictor containing five different perceptual channels, each of which predicts an image quality score individually. Finally, a support vector machine regression module evaluates the final image quality score, taking the combination of the five channels' outputs as input. The proposed model is tested on three popular public databases: LIVE 3D Image Quality Database Phase I, LIVE 3D Image Quality Database Phase II, and the MCL 3D Image Quality Database. The experimental results show that our model yields significant performance improvements in quality prediction for stereoscopic images compared with existing state-of-the-art quality metrics.
- Research Article
4
- 10.3389/fnins.2024.1415679
- May 13, 2024
- Frontiers in Neuroscience
Multimodal medical fusion images (MMFI) are formed by fusing medical images of two or more modalities with the aim of displaying as much valuable information as possible in a single image. However, because various fusion algorithms follow different strategies, the quality of the generated fused images is uneven, so an effective blind image quality assessment (BIQA) method is urgently required. The challenge of MMFI quality assessment is to enable the network to perceive the nuances between fused images of different qualities, and the key to successful BIQA is the availability of valid reference information. To this end, this work proposes a generative adversarial network (GAN)-guided nuance perceptual attention network (G2NPAN) to implement BIQA for MMFI. Specifically, we achieve the blind evaluation style via the design of a GAN and develop a Unique Feature Warehouse module to learn the effective features of fused images at the pixel level. A redesigned loss function guides the network to perceive image quality. Finally, a class activation mapping supervised quality assessment network is employed to obtain the MMFI quality score. Extensive experiments and validation have been conducted on a database of medical fusion images, and the proposed method is superior to state-of-the-art BIQA methods.
- Research Article
1
- 10.3390/s23198205
- Sep 30, 2023
- Sensors
To monitor objects of interest, such as wildlife and people, image-capturing devices are used to collect a large number of images with and without objects of interest. Because these devices record valuable information about the behavior and activity of objects, the quality of images containing objects of interest should be rated higher than that of images without objects of interest, even if the former exhibit more severe distortion than the latter. However, current methods produce the opposite quality assessments. In this study, we propose an end-to-end model, named DETR-IQA (detection transformer image quality assessment), which extends the capability to perform object detection and blind image quality assessment (IQA) simultaneously by adding IQA heads comprising simple multi-layer perceptrons on top of the DETR (detection transformer) decoder. Using these IQA heads, DETR-IQA carries out blind IQA based on the weighted fusion of the distortion degree of the regions containing objects of interest and the other regions of the image; the predicted quality score of images containing objects of interest was generally greater than that of images without objects of interest. Currently, the subjective quality scores of all public datasets accord with image distortion alone and do not consider objects of interest. We manually extracted the images in which the five predefined classes of objects were the main content from the largest authentic distortion dataset, KonIQ-10k, which was used as the experimental dataset. The experimental results show that with slight degradation in object detection performance and simple IQA heads, the PLCC and SRCC values were 0.785 and 0.727, respectively, exceeding those of some deep learning-based IQA models specially designed to perform only IQA.
With the negligible increase in the computation and complexity of object detection and without a decrease in inference speeds, DETR-IQA can perform object detection and IQA via multi-tasking and substantially reduce the workload.
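The weighted fusion described above can be sketched as a simple blend in which the region-of-interest quality dominates the final score. The weight and the score values below are illustrative only, not taken from the paper:

```python
def fuse_quality(q_roi, q_rest, w_roi=0.7):
    """Weighted blend of region-of-interest quality and remaining-region
    quality; w_roi > 0.5 makes objects of interest dominate the score."""
    return w_roi * q_roi + (1.0 - w_roi) * q_rest

# Same two distortion levels, swapped locations: distortion that lands on
# the object of interest lowers the fused score more.
print(fuse_quality(q_roi=40.0, q_rest=80.0))   # 52.0 -> distorted object
print(fuse_quality(q_roi=80.0, q_rest=40.0))   # 68.0 -> distorted background
```

In the full model, the per-region distortion degrees would come from the IQA heads attached to the DETR decoder rather than being supplied by hand.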
- Conference Article
16
- 10.1117/12.2293240
- Mar 7, 2018
Computed tomography (CT) is one of the most important medical imaging modalities. CT images can be used to assist in the detection and diagnosis of lesions and to facilitate follow-up treatment. However, CT images are vulnerable to noise: there are two major intrinsic sources of noise in CT data, namely X-ray photon statistics and the electronic noise background. Therefore, it is necessary to perform image quality assessment (IQA) in CT imaging before diagnosis and treatment. Most existing CT image IQA methods are based on human observer studies, which are impractical in clinical settings because they are complex and time-consuming. In this paper, we present a blind CT image quality assessment method based on a deep learning strategy. A database of 1500 CT images is constructed, containing 300 high-quality images and 1200 corresponding noisy images; specifically, the high-quality images were used to simulate the corresponding noisy images at four different doses. The images were then scored by experienced radiologists on the following attributes using a five-point scale: image noise, artifacts, edge and structure, overall image quality, and tumor size and boundary estimation. We trained a network to learn the non-linear mapping from CT images to subjective evaluation scores, then loaded the pre-trained model to yield predicted scores for test images. To demonstrate the performance of the deep learning network in IQA, the Pearson linear correlation coefficient (PLCC) and Spearman rank order correlation coefficient (SROCC) are utilized. The experimental results demonstrate that the presented deep learning-based IQA strategy can be used for CT image quality assessment.
- Book Chapter
- 10.1007/978-3-319-63754-9_29
- Oct 15, 2017
We propose a combined blind and reference image quality assessment, CBPF-IQA. The main contribution of this paper is an interface that contains not only a full-reference image quality assessment (IQA) but also a no-reference (blind) IQA, applying perceptual concepts by means of Contrast Band-Pass Filtering (CBPF). The proposal contrasts a degraded input image with filtered versions at several distances produced by a CBPF, which computes some of the variables of the Human Visual System (HVS). If CBPF-IQA detects only one input, it performs a blind image quality assessment; if it detects two inputs, it computes a reference image quality assessment. Thus, we first define a full-reference IQA and then a no-reference IQA, whose correlation is important when contrasted with psychophysical results obtained from several observers. CBPF-IQA weights the peak signal-to-noise ratio (PSNR) using an algorithm that estimates some properties of the HVS. We then compare the CBPF-IQA algorithm not only with PSNR, the mainstream estimator in IQA, but also with state-of-the-art IQA algorithms such as Structural SIMilarity (SSIM), Mean Structural SIMilarity (MSSIM), and Visual Information Fidelity (VIF). Our experiments show that CBPF-IQA correlates well with PSNR while not strictly requiring the reference image in order to estimate the quality of the recovered image.