Abstract

We propose a lightweight no-reference (NR) video quality assessment (VQA) method based on a convolutional neural network (CNN) architecture. The proposed method implements a spatiotemporal saliency patch selection procedure that crops each frame into small non-overlapping blocks (patches) and selects the most perceptually relevant ones. The selected patches are then forwarded to the CNN. To determine which patches are the most relevant, spatial and temporal saliency features are computed for each frame. The proposed method does not require subjective scores to train the CNN; instead, it uses objective quality scores, computed with an NR image quality assessment method, as the target quality score for each video frame. Given the lack of large annotated video quality databases, this is an advantage of the proposed method. Finally, although it has a much lower data-processing cost than other state-of-the-art methods, the proposed NR-VQA method obtains robust and competitive results.
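To make the patch selection step concrete, the following is a minimal sketch of how a spatiotemporal saliency patch selection could be implemented. The abstract does not specify the saliency features used, so the gradient-magnitude spatial saliency, frame-difference temporal saliency, equal weighting, `patch_size`, and `top_k` below are all illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def saliency_patch_selection(prev_frame, frame, patch_size=32, top_k=16):
    """Select the top_k most salient non-overlapping patches from a grayscale frame.

    Stand-in saliency measures (the paper's exact features are not given here):
    spatial saliency ~ local gradient magnitude,
    temporal saliency ~ absolute difference from the previous frame.
    """
    f = frame.astype(np.float64)
    p = prev_frame.astype(np.float64)

    # Spatial saliency: gradient magnitude as a simple proxy.
    gy, gx = np.gradient(f)
    spatial = np.hypot(gx, gy)

    # Temporal saliency: absolute frame difference.
    temporal = np.abs(f - p)

    # Combined spatiotemporal saliency map (equal weighting assumed).
    saliency = (spatial / (spatial.max() + 1e-8)
                + temporal / (temporal.max() + 1e-8))

    h, w = frame.shape
    scores, patches = [], []
    # Crop the frame into non-overlapping patch_size x patch_size blocks.
    for y in range(0, h - patch_size + 1, patch_size):
        for x in range(0, w - patch_size + 1, patch_size):
            block = saliency[y:y + patch_size, x:x + patch_size]
            scores.append(block.mean())
            patches.append(frame[y:y + patch_size, x:x + patch_size])

    # Keep the top_k patches by mean saliency; these would be fed to the CNN.
    order = np.argsort(scores)[::-1][:top_k]
    return [patches[i] for i in order]
```

In this sketch, each selected patch would be paired with the frame's objective quality score (produced by an NR image quality assessment method) as its training target, which is what lets the CNN be trained without subjective annotations.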
