Abstract

Video super-resolution (VSR) aims to generate high-resolution (HR) frames from the corresponding low-resolution (LR) frames. It differs sharply from single-image super-resolution (SISR) because of its strong temporal dependency on misaligned supporting frames. Existing methods typically use RNNs to learn the temporal dependency, with other networks (CNNs, GANs) predicting neighboring pixels. Because of memory and processing constraints and the inference time required to upscale LR frames, many VSR techniques cannot be applied on mid-range and budget mobile devices. This paper presents VIhanceD, a real-time, sliding-window-based network that runs on budget smartphones and laptops while producing cutting-edge results on various video datasets. Our approach incorporates both spatial and temporal dependencies so that the upscaled HR frames are coherent and free of motion distortions. We focus on improving the user experience in areas with internet restrictions arising from social, political, and geographical limitations. The mobile app (and PC client) provides a continuous stream of HR frames without buffering. We obtained a PSNR of 33.9 dB on the REDS dataset with a single-frame inference time of 23.6 ms, demonstrating state-of-the-art performance. Our subsequent experiments on the VIMEO-90K dataset demonstrate that the proposed method generalizes and works on both natural video frames and textual data, making it suitable for infotainment and multimedia.
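
As a rough illustration of the two figures quoted in the abstract, the sketch below shows how PSNR is conventionally computed from the mean squared error between a reference frame and an upscaled frame, and what throughput a 23.6 ms per-frame inference time implies. It is not the authors' evaluation code: the function names, the flat-list frame representation, and the 8-bit peak value of 255 are assumptions made for this example, and the abstract does not say which color channels the reported PSNR was measured on.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are flat sequences of pixel intensities (an assumption for this
    sketch; real pipelines usually operate on image arrays).
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


def frames_per_second(frame_time_ms):
    """Throughput implied by a per-frame inference time in milliseconds."""
    return 1000.0 / frame_time_ms


# 23.6 ms is the single-frame inference time reported in the abstract.
fps = frames_per_second(23.6)
```

Under these assumptions, 23.6 ms per frame corresponds to roughly 42 frames per second, which is above the 24 to 30 frames per second of typical video playback.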

