Abstract

Video super-resolution (VSR), which converts a low-resolution (LR) video into a high-resolution (HR) one, has attracted considerable attention. The LR video is typically produced either by downscaling or by a low-resolution sensor. Because resolution degradation or limitation affects the low-frequency (LF) and high-frequency (HF) components of the LR video signal differently, we propose a Decomposition Oriented Video super-rEsolution (DOVE) method. Specifically, we design a three-stream VSR network in which the LF and HF streams model the LF and HF components, respectively, in the feature space. A multi-frame refinement stream takes features of coarsely aligned frames as input and progressively generates finely aligned counterparts, which guide the learning of the LF and HF streams at the intermediate feature level. Furthermore, non-local channel attention is devised to capture long-range dependencies on a global scale in the channel domain. Experimental results indicate that learning the LF and HF components separately helps better estimate the HR frame from LR frames, and that DOVE achieves superior VSR performance compared with recent state-of-the-art methods.
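The motivating observation, that resolution degradation affects LF and HF components differently, can be illustrated with a toy 1-D signal. The sketch below is not the DOVE network: it works on raw samples instead of learned features, and the moving-average filter, the average-pool degradation, and all function names are assumptions chosen only for illustration.

```python
def low_pass(signal, radius=1):
    """Moving-average filter with edge replication (a simple low-pass)."""
    n = len(signal)
    out = []
    for i in range(n):
        # Clamp indices so border samples reuse the nearest valid value.
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def decompose(signal, radius=1):
    """Split a signal into LF (smoothed) and HF (residual) parts."""
    lf = low_pass(signal, radius)
    hf = [s - l for s, l in zip(signal, lf)]
    return lf, hf


def downscale_upscale(signal, factor=2):
    """Assumed degradation: average-pool by `factor`, then repeat samples."""
    pooled = [sum(signal[i:i + factor]) / factor
              for i in range(0, len(signal), factor)]
    return [v for v in pooled for _ in range(factor)][:len(signal)]


def energy(values):
    """Sum of squared samples."""
    return sum(v * v for v in values)


# Toy "HR" signal: a slow ramp (LF content) plus a fast alternation (HF content).
hr = [i / 2 + (1 if i % 2 == 0 else -1) for i in range(16)]
degraded = downscale_upscale(hr)

lf_hr, hf_hr = decompose(hr)
lf_deg, hf_deg = decompose(degraded)

print("LF energy kept:", energy(lf_deg) / energy(lf_hr))
print("HF energy kept:", energy(hf_deg) / energy(hf_hr))
```

In this toy setting the degradation leaves most of the LF energy intact while removing most of the HF energy, which is the kind of asymmetry that motivates modeling the two components in separate streams.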
