Abstract

Assessing the perceived quality of user-generated content (UGC) videos is important for safeguarding the viewing experience of end users. The diversity of content and the mixture of authentic distortions pose great challenges for UGC video quality assessment (UGC-VQA). The reverse hierarchy theory suggests that the human visual system (HVS) combines bottom-up feedforward perception with top-down feedback perception. However, existing UGC-VQA methods rarely consider feedback perception, which makes it difficult to model the complete visual perception loop and leads to inaccurate prediction of perceived quality. This paper therefore proposes a bidirectional hierarchical semantic extraction structure for VQA (BHSE-VQA), which simulates both visual feedforward and feedback perception. Specifically, a feedforward and feedback multi-level network is first designed: the feedforward pathway extracts multi-level spatio-temporal features with a 3D-ConvNeXt backbone, and the feedback pathway processes these hierarchical features with a combination of channel attention and spatial attention mechanisms. Then, because responses at different perception layers affect the visual perception result to different degrees, the weights of the features at each level are redistributed to be consistent with human perception. Given the features from both pathways, a temporal attention fusion network is introduced to further capture temporal correlations and to aggregate features based on the residuals between feedforward and feedback features for quality prediction. Experimental results on several representative UGC-VQA databases demonstrate the effectiveness of the proposed model and the significance of comprehensive hierarchical perception modeling for UGC-VQA.
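The abstract does not give the exact formulation of the level re-weighting or of the residual-based temporal fusion, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. The function names (`reweight_levels`, `temporal_attention_fuse`), the softmax-normalised level weights, the scoring vector `score_vec`, and the assumption that every level has already been projected to a common `(T, D)` shape are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def reweight_levels(level_feats, level_logits):
    """Combine per-level features with learned level weights (assumed form).

    level_feats:  array of shape (L, T, D), one (T, D) feature map per level,
                  assumed to be already projected to a common size.
    level_logits: array of shape (L,), hypothetical learnable scores per level.
    Returns an array of shape (T, D).
    """
    weights = softmax(level_logits)  # (L,), non-negative and summing to 1
    return np.tensordot(weights, level_feats, axes=(0, 0))


def temporal_attention_fuse(ff, fb, score_vec):
    """Aggregate frames with attention driven by the pathway residual (assumed form).

    ff, fb:    feedforward and feedback features, each of shape (T, D).
    score_vec: hypothetical learnable vector of shape (D,) that scores each frame.
    Returns the fused feature of shape (D,) and the attention weights of shape (T,).
    """
    residual = ff - fb                    # (T, D) disagreement between pathways
    attn = softmax(residual @ score_vec)  # (T,) one weight per time step
    fused = attn @ (ff + fb)              # (D,) attention-weighted temporal pooling
    return fused, attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    levels, frames, dim = 4, 8, 16
    ff = reweight_levels(rng.normal(size=(levels, frames, dim)), rng.normal(size=levels))
    fb = reweight_levels(rng.normal(size=(levels, frames, dim)), rng.normal(size=levels))
    fused, attn = temporal_attention_fuse(ff, fb, rng.normal(size=dim))
    print(fused.shape, attn.shape)
```

In this sketch the fused vector would then be passed to a regression head to produce the quality score; that head, like the rest of the block, is an assumption rather than something the abstract specifies.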
