Abstract

Video super-resolution based on CNNs has recently achieved significant progress. Existing popular super-resolution methods usually rely on optical flow between neighboring frames for temporal alignment. However, accurate optical flow is hard to estimate and computationally expensive. To address this problem, we propose a novel grouped spatio-temporal alignment network (GSTAN) that effectively incorporates spatio-temporal information in a hierarchical way. The input sequence is divided into several groups corresponding to different frame rates, and these groups provide complementary information that helps restore missing textures in the reference frame. Specifically, each group employs deformable 3D convolution to incorporate spatio-temporal information, which avoids the artifacts caused by explicit motion estimation. In addition, a Gated-Dconv information filter is proposed to control the information flow so that the network focuses on fine details. Finally, the complementary information from all groups is integrated by an inter-group fusion module. Extensive experiments demonstrate that our method achieves state-of-the-art SR performance on several benchmark datasets.
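To make the grouping idea concrete, the following is a minimal sketch (not the authors' code) of how an input sequence might be split into temporal groups corresponding to different frame rates, each centered on the reference frame. The three-frame group size and the temporal dilations (1, 2, 3) are assumptions for illustration only.

```python
def group_frames(frames, reference_idx, dilations=(1, 2, 3)):
    """Split a frame sequence into temporal groups at different frame rates.

    Each group holds the previous frame, the reference frame, and the next
    frame at the given temporal stride, clamped to the sequence bounds.
    NOTE: group size and dilation values are illustrative assumptions,
    not taken from the paper.
    """
    n = len(frames)
    groups = []
    for d in dilations:
        idxs = [max(0, reference_idx - d),
                reference_idx,
                min(n - 1, reference_idx + d)]
        groups.append([frames[i] for i in idxs])
    return groups


# Example: a 7-frame sequence with the reference frame in the middle.
frames = list(range(7))  # stand-ins for actual video frames
groups = group_frames(frames, reference_idx=3)
# groups -> [[2, 3, 4], [1, 3, 5], [0, 3, 6]]
```

Each group thus samples the sequence at a different temporal rate: the first captures fast, short-range motion, while the later groups cover a wider temporal window and provide complementary context for the reference frame.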
