Abstract

This paper presents SUPERVEGAN, a novel model family for enhancing low-bitrate video streams (e.g. 250 Kbps) by simultaneously performing video super resolution and removing compression artifacts. Our strategy is fully end-to-end, but we upsample and tackle the problem in two main stages. The first stage removes streaming compression artifacts and performs a partial upsampling; the second stage performs the final upsampling and adds detail generatively. We also use a novel progressive training strategy for video, together with perceptual metrics. Our experiments show resilience to training bitrate, and we show how to derive real-time models. We also introduce a novel bitrate equivalency test that enables the assessment of how much a model improves streams with respect to bitrate. We demonstrate efficacy on two publicly available HD datasets, LIVE-NFLX-II and Tears of Steel (TOS). We compare against a range of baselines and encoders, and our results demonstrate that our models achieve a perceptual equivalence of up to two times the input bitrate. In particular, our 4X upsampling model outperforms baseline methods on the LPIPS perceptual metric, and our 2X upsampling model also outperforms baselines on traditional metrics such as PSNR.
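As a rough illustration of what a bitrate equivalency comparison could look like, the sketch below takes a reference quality curve for unenhanced streams (bitrate against a lower-is-better perceptual score such as LPIPS) and finds, by linear interpolation, the bitrate at which the unenhanced stream would match the score of an enhanced stream. The function name `equivalent_bitrate`, the interpolation scheme, and all numbers are assumptions made for illustration; they are not taken from the paper and do not describe its actual test procedure.

```python
def equivalent_bitrate(curve, target_score):
    """Return the bitrate at which the reference curve reaches target_score.

    curve: list of (bitrate_kbps, score) pairs sorted by increasing bitrate,
           where a lower score means better perceptual quality (LPIPS-like)
           and the score does not increase with bitrate.
    target_score: score achieved by the enhanced stream.
    Returns None when target_score lies outside the range covered by curve.
    """
    for (b0, s0), (b1, s1) in zip(curve, curve[1:]):
        if s1 <= target_score <= s0:
            if s0 == s1:
                return float(b0)
            # Linear interpolation between the two neighbouring curve points.
            t = (s0 - target_score) / (s0 - s1)
            return b0 + t * (b1 - b0)
    return None


# Hypothetical reference curve for unenhanced streams (illustrative values only).
reference_curve = [(250, 0.40), (500, 0.30), (1000, 0.20)]

# Hypothetical score of a 250 Kbps stream after enhancement.
enhanced_score = 0.30
input_bitrate = 250

matched = equivalent_bitrate(reference_curve, enhanced_score)
if matched is not None:
    # Ratio of the matched bitrate to the input bitrate.
    print(f"equivalent bitrate: {matched:.0f} Kbps, "
          f"ratio: {matched / input_bitrate:.2f}x")
```

With these made-up values, the enhanced 250 Kbps stream would be judged equivalent to an unenhanced stream at roughly twice the bitrate, which is the kind of ratio the abstract refers to as perceptual equivalence.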

Highlights

  • Two important computer vision problems that benefit from the ability of Generative Adversarial Networks (GANs) to work with little input data are: 1) Video Super Resolution (VSR) and 2) Video Enhancement (VE) such as when the video has gained artifacts, lost sharpness or color depth from high levels of compression

  • We develop deep learning models for the joint VSR and VE problem, as well as evaluation methodologies, to allow the generation of high resolution, high quality videos from sources that have been degraded by video compression

  • We focus on GAN-based models and demonstrate how they can be successfully applied to the problem of joint video super resolution and artifact removal for videos compressed at low bitrates


Summary

Introduction

Two important computer vision problems that benefit from the ability of Generative Adversarial Networks (GANs) to work with little input data are 1) Video Super Resolution (VSR) and 2) Video Enhancement (VE), for example when a video has gained artifacts or lost sharpness or color depth through high levels of compression. Of these problems, VSR has been studied more widely. Constrained bandwidth affects live video transmission, where no advance encoding or buffering is possible, and the problem is accentuated by the spread of IoT devices, home security cameras, video conferencing and streaming cameras on mobile devices. Video is already the most ubiquitous type of data transmitted: IP video currently accounts for over 82% of global IP traffic, and the average internet household is expected to have generated over 117.8 gigabytes of Internet traffic per month in 2020 [1].

