Abstract

Video super-resolution (VSR) aims to restore high-resolution (HR) videos from low-resolution (LR) ones. Previous methods commonly rely on optical flow for frame alignment and design frameworks from the perspectives of space and time. However, optical flow estimation is error-prone, and inaccurate flow leads to inferior restoration. In addition, how to effectively fuse the features of multiple video frames remains a challenging problem. In this paper, we propose a Local-Global Fusion Network (LGFN) to address these issues from a novel viewpoint. As an alternative to optical flow, deformable convolutions (DCs) with decreased multi-dilation convolution units (DMDCUs) are applied for efficient implicit alignment. Moreover, a two-branch structure, consisting of a Local Fusion Module (LFM) and a Global Fusion Module (GFM), is proposed to combine information from two complementary aspects. Specifically, the LFM focuses on the relationship between adjacent frames and maintains temporal consistency, while the GFM exploits all related features globally with a video shuffle strategy. Experimental results on several datasets demonstrate that LGFN not only achieves performance competitive with state-of-the-art methods but also restores a wide variety of video frames reliably. Results of LGFN on benchmark datasets are presented at https://github.com/BIOINSu/LGFN, and the source code will be released upon acceptance of the paper.

Highlights

  • As one of the fundamental sub-tasks of video enhancement, video super-resolution (VSR) aims at mapping low-resolution (LR) videos into corresponding high-resolution (HR) ones

  • In this paper, we propose a novel local-global fusion network (LGFN) for high-quality video super-resolution

  • The local fusion module focuses on combining adjacent features and maintaining temporal consistency, while the global fusion module fully exploits all input frames


Summary

INTRODUCTION

As one of the fundamental sub-tasks of video enhancement, video super-resolution (VSR) aims at mapping low-resolution (LR) videos into corresponding high-resolution (HR) ones. One branch of our network, the Local Fusion Module (LFM), combines information between adjacent frames so that the generated features maintain close correlation and temporal consistency. Liu et al. [28] put forward a locally connected video super-resolution (LCVSR) approach, which uses a similar tactic to perform motion compensation and then refines the features through a refinement network. In our work, we adopt deformable convolution v2 [43] in the alignment module and further improve performance with our proposed decreased multi-dilation convolution unit (DMDCU), which yields more accurate results while keeping the extra computational cost relatively low. The other branch, the Global Fusion Module (GFM), aims to fully exploit the information of all input frames and produces the globally fused features {Gt−N, …, Gt+N}.

ALIGNMENT MODULE
GLOBAL FUSION MODULE
ABLATION STUDY
Findings
CONCLUSION AND FUTURE WORK
