Abstract

A video compression framework based on spatio-temporal resolution adaptation (ViSTRA) is proposed, which dynamically resamples the input video spatially and temporally during encoding, based on a quantisation-resolution decision, and reconstructs the full resolution video at the decoder. Temporal upsampling is performed using frame repetition, whereas a convolutional neural network super-resolution model is employed for spatial resolution upsampling. ViSTRA has been integrated into the high efficiency video coding reference software (HM 16.14). Experimental results verified via an international challenge show significant improvements, with BD-rate gains of 15% based on PSNR and an average MOS difference of 0.5 based on subjective visual quality tests.

Highlights

  • With the ever increasing demand for more immersive visual experiences, video content providers have been extending the video parameter space by using higher spatial resolutions, frame rates and dynamic ranges

  • Inspired by our previous work on quality assessment [1], [2], [12], [13] and spatial resolution adaptation for intra coding [14], we propose a spatio-temporal resolution adaptation framework for video compression, ViSTRA, which dynamically predicts the optimal spatial and temporal resolutions for the input video during encoding and attempts to reconstruct the full resolution video at the decoder

  • The proposed framework was integrated into HEVC test model HM 16.14 and was submitted to the Grand Challenge on Video Compression Technology at the International Conference on Image Processing (ICIP) 2017 [15]

Summary

INTRODUCTION

With the ever increasing demand for more immersive visual experiences, video content providers have been extending the video parameter space by using higher spatial resolutions, frame rates and dynamic ranges. By dynamically predicting these parameters, bitrates could be significantly reduced while maintaining equivalent perceptual video quality. In this context, several authors have proposed reducing spatial resolution for low bitrate encoding [3], [4], but these approaches lack a reliable adaptation technique. The experimental results presented here are based on test sequences used in the Video Compression Grand Challenge at IEEE ICIP 2017 [15]. These show substantial coding gains, on average 14.5% BD-rate (PSNR) and a 0.52 average MOS difference (from an independent subjective test), compared to the original HEVC anchor codec (HM 16.14). The remainder of this paper is organised as follows: Section II describes the proposed framework; Section III provides more detail on the design of the QRO module; Section IV describes the methods employed for spatial and temporal resolution resampling; Section V presents and discusses the experimental design and results; and Section VI provides conclusions and ideas for future work.
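
As a rough illustration of the temporal side of the framework, the sketch below shows frame repetition, which the abstract names as the temporal upsampling method at the decoder. The function names `repeat_frames` and `drop_frames` are hypothetical, and frames are represented as opaque Python objects. The downsampling step shown here (keeping every n-th frame) is only an assumption made so that the example round-trips; this summary does not state how ViSTRA reduces the frame rate at the encoder.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")  # any per-frame object, e.g. an image array


def drop_frames(frames: Sequence[Frame], factor: int) -> List[Frame]:
    """Assumed temporal downsampling: keep every `factor`-th frame.

    This is a placeholder for illustration only, not the method
    described in the paper.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return list(frames[::factor])


def repeat_frames(frames: Sequence[Frame], factor: int) -> List[Frame]:
    """Temporal upsampling by frame repetition.

    Each decoded frame is emitted `factor` times in a row, so the
    output has `factor` times as many frames as the input.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    upsampled: List[Frame] = []
    for frame in frames:
        upsampled.extend([frame] * factor)
    return upsampled


if __name__ == "__main__":
    original = ["f0", "f1", "f2", "f3"]
    low_rate = drop_frames(original, 2)       # half the frame rate
    restored = repeat_frames(low_rate, 2)     # back to the full frame rate
    print(low_rate)
    print(restored)
```

Frame repetition restores the original frame count without any motion estimation, which keeps decoder-side cost low; the restored sequence holds each retained frame for `factor` display intervals rather than recovering the dropped frames.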

PROPOSED FRAMEWORK
Temporal Decisions
Spatial Decisions
Temporal Resampling
Spatial Resampling
EXPERIMENTAL RESULTS
CONCLUSION