Omniscient Video Super-Resolution with Explicit-Implicit Alignment

Peng Yi,Laigan Luo,Kui Jiang,Zhongyuan Wang,Zheng He,Junjun Jiang,Jiayi Ma,Tao Lu

doi:10.1145/3640346

Abstract

When considering the temporal relationships, most previous video super-resolution (VSR) methods follow the iterative or recurrent framework. The iterative framework adopts neighboring low-resolution (LR) frames from a sliding window, while the recurrent framework utilizes the output generated in the previous SR procedure. The hybrid framework combines them but still cannot fully leverage the temporal relationships. Meanwhile, the existing methods are limited in the receptive field of the optical flow or lack semantic constrains on motion information. In this work, we propose an omniscient framework to fully explore the temporal relationships in the video, which encompasses both LR frames and SR outputs from the past, present, and future. The omniscient framework is more generic because the iterative, recurrent, and hybrid frameworks can be regarded as its special cases. Besides, when addressing the motion information, most previous VSR methods adopt the explicit motion estimation and compensation, while many recent methods turn to implicit alignment. In implicit alignment methods, because basic non-local means suffers from heavy computational costs, we improve it by capturing the non-local correlations in a relatively local manner to reduce the complexity. Moreover, we integrate the explicit and implicit methods into an explicit-implicit alignment module to better utilize motion information. We have conducted extensive experiments on public datasets, which show that our method is superior over the state-of-the-art methods in objective metrics, subjective visual quality, and complexity. In particular, on datasets of Vid4 and UDM10, our method improves PSNR by 0.19 dB, 0.49 dB against the most advanced method BasicVSR++, respectively.

Full Text