Abstract

Many recent blind image quality assessment (BIQA) algorithms based on convolutional neural networks (CNNs) share a common two-stage structure: local quality measurement followed by global pooling. In this paper, we focus on the pooling stage and propose an attention-based pooling network (APNet) for BIQA. The core idea is to introduce a learnable pooling that models human visual attention in a data-driven manner. Specifically, APNet incorporates an attention module and jointly learns local quality and local weights, automatically learning to assign visual weights while generating quality estimates. Moreover, we introduce a correlation constraint between the estimated local quality and the attention weights to regularize training. The constraint penalizes cases in which the local quality estimate of a region that attracts more attention deviates substantially from the overall quality score. Experimental results on benchmark databases demonstrate that APNet achieves state-of-the-art prediction accuracy. By yielding an attention weight map as a by-product, our model also offers better interpretability of the learned pooling.
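The pooling and constraint described above can be sketched numerically. The following is a minimal NumPy illustration, not the paper's actual network: `apnet_pool` and `attention_consistency_penalty` are hypothetical names, and the quadratic penalty is an assumed stand-in for the correlation constraint between local quality and attention weight.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: normalizes attention logits into weights.
    e = np.exp(x - x.max())
    return e / e.sum()

def apnet_pool(local_quality, attention_logits):
    """Attention-weighted pooling of per-region quality scores.

    local_quality    : array of local quality estimates, one per region
    attention_logits : learned attention scores for the same regions
    Returns the pooled global quality and the normalized weights.
    """
    weights = softmax(attention_logits)
    return float(np.sum(weights * local_quality)), weights

def attention_consistency_penalty(local_quality, weights, global_score):
    """Assumed regularizer: penalize high-attention regions whose local
    quality deviates from the pooled global score (larger weight =>
    larger penalty for the same deviation)."""
    return float(np.sum(weights * (local_quality - global_score) ** 2))

# Usage: two regions, uniform attention pools to the plain mean.
lq = np.array([0.2, 0.8])
score, w = apnet_pool(lq, np.zeros(2))          # score = 0.5
penalty = attention_consistency_penalty(lq, w, score)
```

With uniform logits the pooling reduces to a simple average; shifting attention toward one region pulls the global score toward that region's local estimate, which is what makes the pooling learnable rather than fixed.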
