Abstract

How to exploit information from the temporal, spatial, and frequency domains is crucial for the quality enhancement of compressed video. State-of-the-art methods generally design powerful networks to fuse the spatiotemporal information of a video, but they neither fully exploit nor effectively fuse the spatiotemporal information of the entire sequence, so the learned context is only loosely related to the target frame. In addition, compressed videos suffer varying degrees of information loss across frequency bands. Previous methods ignore this non-uniform distortion and do not tailor their processing to the individual bands, so the true texture details of the video cannot be restored. In this paper, we propose an omniscient network that learns the spatiotemporal and omni-frequency information of a video more effectively. The network consists of two novel components: a Spatio-Temporal Feature Fusion (STFF) module and an Omni-Frequency Adaptive Enhancement (OFAE) block. The former captures spatiotemporal information from adjacent frames, while the latter adaptively restores the different frequency bands of the compressed video. Information is propagated bidirectionally in a grid manner so that the omni-enhanced results can be fully exploited. Extensive experiments show that our method outperforms state-of-the-art methods in terms of objective metrics, subjective visual quality, and model complexity.
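
The abstract names the two components (STFF and OFAE) but not their internals. The following is a minimal PyTorch sketch of how such a pipeline could be wired: STFF fuses features of the target frame with its two neighbours, and OFAE splits features into low- and high-frequency bands and enhances each adaptively. Every layer choice (the average-pool frequency split, the sigmoid gating, the residual reconstruction head) is a hypothetical illustration, not the authors' design, and the bidirectional grid propagation is omitted for brevity.

```python
# Hypothetical sketch of the described pipeline; layer choices are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class STFF(nn.Module):
    """Spatio-Temporal Feature Fusion (internals assumed): fuses the target
    frame's features with those of its forward/backward neighbours."""
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Conv2d(3 * channels, channels, 3, padding=1)

    def forward(self, prev_feat, cur_feat, next_feat):
        x = torch.cat([prev_feat, cur_feat, next_feat], dim=1)
        return F.relu(self.fuse(x))


class OFAE(nn.Module):
    """Omni-Frequency Adaptive Enhancement (internals assumed): splits
    features into low/high-frequency bands, enhances each band with its own
    branch, and reweights the bands adaptively per pixel."""
    def __init__(self, channels):
        super().__init__()
        self.low = nn.Conv2d(channels, channels, 3, padding=1)   # smooth content
        self.high = nn.Conv2d(channels, channels, 3, padding=1)  # texture/detail
        self.gate = nn.Conv2d(channels, channels, 1)             # band weighting

    def forward(self, x):
        # Low band via downsample + upsample; high band is the residual.
        low_band = F.interpolate(F.avg_pool2d(x, 2), size=x.shape[-2:],
                                 mode="bilinear", align_corners=False)
        high_band = x - low_band
        w = torch.sigmoid(self.gate(x))  # adaptive per-pixel band weight
        return x + w * self.high(high_band) + (1 - w) * self.low(low_band)


class OmniscientSketch(nn.Module):
    """Toy end-to-end flow: per-frame encoding, STFF fusion, OFAE refinement,
    and a head that predicts a residual added back to the target frame."""
    def __init__(self, channels=32):
        super().__init__()
        self.encode = nn.Conv2d(3, channels, 3, padding=1)
        self.stff = STFF(channels)
        self.ofae = OFAE(channels)
        self.decode = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, prev_frame, cur_frame, next_frame):
        feats = [F.relu(self.encode(f))
                 for f in (prev_frame, cur_frame, next_frame)]
        refined = self.ofae(self.stff(*feats))
        return cur_frame + self.decode(refined)  # residual enhancement


# Usage on dummy data:
net = OmniscientSketch()
frames = [torch.randn(1, 3, 64, 64) for _ in range(3)]
enhanced = net(*frames)
print(enhanced.shape)  # torch.Size([1, 3, 64, 64])
```

The residual formulation (enhancing the decoded frame rather than regenerating it) is a common choice in compressed-video enhancement; whether the paper uses it is an assumption here.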
