Abstract

Fractional motion compensation (FMC) has been widely adopted in modern video coding standards such as High Efficiency Video Coding (HEVC). In FMC, fractional pixels are generated by interpolation filters; in HEVC, the filter coefficients are derived from a Fourier decomposition of the discrete cosine transform (DCT). To address the limited compression efficiency of these handcrafted filters, deep learning-based interpolation filters have recently been introduced. In this paper, we analyze the drawbacks of the mean square error (MSE)-only loss function that is widely adopted in existing learning-based approaches, and we bring rate-distortion optimization (RDO) theory into network training to formulate a joint loss function that overcomes them. We then propose a new training scheme that implements the joint loss function. In addition to computing the MSE loss, the scheme feeds the residual into a rate estimation model with three stages (transform, quantization, and rate modeling), and the estimated rate cost serves as the second term of the loss function. Finally, we propose a novel network structure that uses deformable convolution layers and residual dense blocks to strengthen feature extraction and propagation so that the network adapts to complex video content. The proposed RDO-based interpolation filter (RDOIF) has been integrated into the HEVC reference software HM-16.7, and extensive experiments have been conducted to verify its efficiency. Experimental results show that RDOIF achieves average BD-rate reductions of 3.1%, 2.4%, and 1.4% under the low-delay-P, low-delay-B, and random access configurations, respectively. When implemented in the Versatile Video Coding (VVC) test model VTM-10.0, the proposed method achieves average BD-rate savings of 2.3%, 1.2%, and 0.5% under the same three configurations, respectively.
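
To make the structure of the joint loss concrete, the following is a minimal sketch of an MSE term combined with a rate term obtained from a transform, quantization, and rate modeling chain. It is an illustration under stated assumptions, not the model used in the paper: the orthonormal DCT-II on square blocks, the uniform quantizer with step `qstep`, the `log2(1 + |level|)` rate proxy, and the weight `lam` are all hypothetical choices, as are the function names.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return m

def estimate_rate(residual, qstep):
    """Hypothetical rate proxy for a square residual block.

    Stage 1 (transform): separable 2D DCT of the residual.
    Stage 2 (quantization): uniform rounding with step size qstep.
    Stage 3 (rate modeling): sum of log2(1 + |level|), an assumed proxy
    that grows with the magnitude of the quantized levels.
    """
    d = dct_matrix(residual.shape[0])
    coeffs = d @ residual @ d.T
    levels = np.round(coeffs / qstep)
    return float(np.sum(np.log2(1.0 + np.abs(levels))))

def joint_loss(predicted, target, qstep=8.0, lam=0.01):
    """Distortion (MSE) plus lam times the estimated rate of the residual."""
    residual = target - predicted
    mse = float(np.mean(residual ** 2))
    rate = estimate_rate(residual, qstep)
    return mse + lam * rate

# Usage: an 8x8 block predicted as all zeros against a flat target of 16.
target = np.full((8, 8), 16.0)
predicted = np.zeros((8, 8))
loss = joint_loss(predicted, target)
```

Hard rounding has zero gradient almost everywhere, so a sketch like this only evaluates the loss; training a network against it would require a differentiable stand-in for the quantization stage, which is not shown here.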
