Image-capturing devices that monitor objects of interest, such as wildlife and people, collect large numbers of images, both with and without those objects. Because images containing objects of interest record valuable information about the behavior and activity of those objects, they should be rated higher in quality than images without objects of interest, even when they exhibit more severe distortion. Current quality assessment methods, however, produce the opposite result. In this study, we propose an end-to-end model, named DETR-IQA (detection transformer image quality assessment), which performs object detection and blind image quality assessment (IQA) simultaneously by adding IQA heads comprising simple multi-layer perceptrons on top of the DETR (detection transformer) decoder. Using the IQA heads, DETR-IQA performs blind IQA based on a weighted fusion of the distortion degree of the regions containing objects of interest and that of the other regions of the image; the predicted quality scores of images containing objects of interest are generally higher than those of images without objects of interest. Currently, the subjective quality scores in all public datasets reflect only the distortion of the images and do not consider objects of interest. We manually extracted, from the largest authentic distortion dataset, KonIQ-10k, the images whose main content belongs to one of five predefined object classes, and used them as the experimental dataset. The experimental results show that, with simple IQA heads and only a slight degradation in object detection performance, DETR-IQA achieves a Pearson linear correlation coefficient (PLCC) of 0.785 and a Spearman rank-order correlation coefficient (SRCC) of 0.727, exceeding some deep learning-based IQA models designed solely for IQA. With a negligible increase in the computation and complexity of object detection and no decrease in inference speed, DETR-IQA performs object detection and IQA as a multi-task model and substantially reduces the workload.
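For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how PLCC and SRCC can be computed between predicted quality scores and subjective scores. It is not the evaluation code used in this study: the function names (`plcc`, `srcc`, `average_ranks`) and the sample scores are illustrative assumptions, and the nonlinear logistic mapping that some IQA evaluations apply to predictions before computing PLCC is omitted here.

```python
import math


def plcc(x, y):
    """Pearson linear correlation coefficient between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if den == 0:
        raise ValueError("correlation is undefined for constant input")
    return num / den


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def srcc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(average_ranks(x), average_ranks(y))


# Hypothetical scores, for illustration only (not data from this study).
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
subjective = [1.0, 4.0, 9.0, 16.0, 25.0]
print(f"PLCC = {plcc(predicted, subjective):.3f}")
print(f"SRCC = {srcc(predicted, subjective):.3f}")
```

In this toy example the predictions order the images exactly as the subjective scores do, so SRCC reaches its maximum, while PLCC is lower because the relationship is monotonic but not linear. PLCC therefore reflects prediction accuracy and SRCC reflects prediction monotonicity.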