Multi-Dimensional Feature Fusion Network for No-Reference Quality Assessment of In-the-Wild Videos.

Jiu Jiang, Meng Tian, Hongtai Yao, Xianpei Wang, Bowen Li

doi:10.3390/s21165322

Open Access

Journal: Sensors
Publication Date: Aug 6, 2021
Citations: 5
License type: CC BY 4.0

Affiliation: Wuhan University

Abstract

Over the past few decades, video quality assessment (VQA) has become a valuable research field. Perceiving the quality of in-the-wild videos without a reference is challenging mainly because of hybrid distortions that vary dynamically and because of the motion of the content. To address this challenge, we propose a no-reference video quality assessment (NR-VQA) method that adds enhanced awareness of dynamic information to the perception of static objects. Specifically, we use convolutional networks of different dimensions to extract low-level static-dynamic fusion features from video clips and then align them. A temporal memory module, consisting of recurrent neural network branches and fully connected (FC) branches, then constructs feature associations across the time series. To simulate human visual habits, we build a parametric adaptive network structure to obtain the final score. We validate the proposed method on four datasets (CVD2014, KoNViD-1k, LIVE-Qualcomm, and LIVE-VQC) to test its generalization ability. Extensive experiments demonstrate that the proposed method not only outperforms other NR-VQA methods in overall performance on mixed datasets but also achieves competitive performance on individual datasets compared with existing state-of-the-art methods.

Full Text