Abstract

A deep learning method called PTR-CNN (Predicted frame with Transform unit partition and prediction Residual aided CNN) is proposed for in-loop filtering in video compression. To reduce the computational complexity of an end-to-end CNN in-loop filter, a non-learning method of reference frame selection is designed to select the highest quality frame based on the frame’s blurriness and smoothiness scores. The transform unit (TU) partition and the prediction residual (PR) of the current frame are used as extra inputs to the neural network as the filtering guidance. The selected similar and high quality reference frame (RF) and the current unfiltered frame (CUF) are input to a CNN based motion compensation module to generate a predicted frame (PF). Finally input the PF, the CUF, the CUF’s TU partition and the CUF’s PR into the main CNN to reconstruct the filtered frame. The model is implemented in Tensorflow and tested in HEVC and AV1. Experimental results show that the complexity of proposed PTR-CNN is less than SOTA CNN-based reference aided in-loop filtering methods and slightly outperforms their RD performance. The scheme introduces a complexity overhead of 7% on the encoder. In particular, for random access, the proposed model achieves 11.78% coding gain over HEVC with DBF/SAO off, while has a gain of 4.76% over HEVC with DBF/SAO on. Ablation study demonstrates that the RF contributes about 10% of the total gain, and the TU and PR contribute over 4% of the total one, proving the effectiveness of each module. Moreover, it is observed that the proposed method can restore detailed structures and textures and hence improve the subjective quality.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call