Abstract

For more efficient inter prediction in quality scalable high efficiency video coding (SHVC), this letter proposes a learning-based framework for generating a virtual reference frame (VRF). In our method, reconstructed base layer (BL) and enhancement layer (EL) frames are employed to make the generated VRF as similar as possible to the current EL frame. To this end, we propose a novel VRF generation convolutional neural network (VRFCNN) that jointly handles enhancement of the corresponding BL frame and compensation of the previous EL frame. Specifically, the VRFCNN consists of BL enhancement, EL compensation, and feature fusion subnets. The previous EL frame is first compensated with a coarse flow learned between two adjacent BL frames, and the compensated frame then provides inter-layer information for enhancing the corresponding BL frame. A finer flow, learned between the EL features and the enhanced BL features, in turn provides temporal information for compensating the previous EL frame. To handle both slow- and fast-motion videos efficiently, the enhanced BL and compensated EL features are fused to generate the VRF. Experimental results show that VRFCNN achieves an average BD-rate reduction of 11.8% under the low-delay P configuration, outperforming other methods. The code of our VRFCNN approach is available at https://github.com/dq0309/VRFCNN.
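To make the data flow concrete, the following is a minimal PyTorch-style sketch of the three-subnet structure described above. It assumes quality scalability (so BL and EL frames share the same resolution) and single-channel luma inputs; all module names, layer widths, and the simple backward-warping step are illustrative assumptions, not the authors' released implementation (see the linked repository for the actual code).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp(x, flow):
    """Backward-warp a tensor (B, C, H, W) with a dense flow field (B, 2, H, W)."""
    b, _, h, w = x.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(x.device)   # (2, H, W)
    coords = grid.unsqueeze(0) + flow                          # (B, 2, H, W)
    gx = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0              # normalize to [-1, 1]
    gy = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    return F.grid_sample(x, torch.stack((gx, gy), dim=-1), align_corners=True)

class ConvBlock(nn.Sequential):
    def __init__(self, in_ch, out_ch):
        super().__init__(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class VRFCNNSketch(nn.Module):
    """Hypothetical three-subnet VRF generator: BL enhancement, EL compensation, fusion."""
    def __init__(self, ch=64):
        super().__init__()
        # Coarse flow between two adjacent BL frames, used to align the previous EL frame.
        self.coarse_flow = nn.Sequential(
            ConvBlock(2, ch), ConvBlock(ch, ch), nn.Conv2d(ch, 2, 3, padding=1))
        # BL enhancement subnet: current BL frame plus the aligned EL frame as inter-layer cue.
        self.bl_enhance = nn.Sequential(ConvBlock(2, ch), ConvBlock(ch, ch))
        # Finer flow between EL features and enhanced BL features (temporal cue).
        self.el_feat = ConvBlock(1, ch)
        self.fine_flow = nn.Sequential(ConvBlock(2 * ch, ch), nn.Conv2d(ch, 2, 3, padding=1))
        # Feature fusion subnet: merge enhanced BL and compensated EL features into the VRF.
        self.fuse = nn.Sequential(ConvBlock(2 * ch, ch), nn.Conv2d(ch, 1, 3, padding=1))

    def forward(self, bl_cur, bl_prev, el_prev):
        # 1) Coarse BL-to-BL flow, applied to the previous EL frame.
        flow_c = self.coarse_flow(torch.cat((bl_cur, bl_prev), dim=1))
        el_aligned = warp(el_prev, flow_c)
        # 2) Enhance the current BL frame using the aligned EL frame as inter-layer information.
        bl_feat = self.bl_enhance(torch.cat((bl_cur, el_aligned), dim=1))
        # 3) Finer flow between EL and enhanced BL features; compensate previous EL features.
        el_f = self.el_feat(el_prev)
        el_comp = warp(el_f, self.fine_flow(torch.cat((el_f, bl_feat), dim=1)))
        # 4) Fuse enhanced BL and compensated EL features into the virtual reference frame.
        return self.fuse(torch.cat((bl_feat, el_comp), dim=1))

if __name__ == "__main__":
    # Smoke test on random single-channel (luma) frames.
    net = VRFCNNSketch()
    frames = [torch.rand(1, 1, 64, 64) for _ in range(3)]
    print(net(*frames).shape)  # torch.Size([1, 1, 64, 64])
```

The generated VRF would then be inserted into the EL reference picture list so that inter prediction can choose between it and the conventional references.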
