Abstract

High-Efficiency Video Coding (HEVC) provides a better compression ratio than the earlier standard, H.264/Advanced Video Coding (AVC). In fact, HEVC saves 50% of the bit rate compared to H.264/AVC for the same subjective quality. This improvement is obtained notably through the hierarchical quadtree-structured Coding Unit (CU). However, the computational complexity increases significantly because of the full-search Rate-Distortion Optimization (RDO), which is used to reach the optimal Coding Tree Unit (CTU) partition. Despite the many speedup algorithms developed in the literature, HEVC encoding complexity remains a crucial problem in the video coding field. To address this problem, we propose in this paper a deep learning model-based fast mode decision algorithm for HEVC inter mode. First, we give an in-depth overview of the proposed CNN-LSTM, which plays a pivotal role in this contribution by predicting the CU splitting and thereby reducing the HEVC encoding complexity. Second, a large training and inference dataset for HEVC intercoding is used to train and test the proposed deep framework. Within this framework, the temporal correlation of the CU partition across video frames is modeled by the LSTM network. Numerical results show that the proposed CNN-LSTM scheme reduces the encoding complexity by 58.60%, with an increase in BD rate of 1.78% and a decrease in BD-PSNR of 0.053 dB. Compared to related works, the proposed scheme achieves the best compromise between RD performance and complexity reduction.

Highlights

  • New-generation digital media technology is emerging and multimedia applications are developing rapidly, for example HD and UHD surveillance cameras in smart cities, alongside the fast growth of smart connected devices (IoT) that stream video in real time. This popularity has drawn attention from both industry and the academic community

  • This paper proposes a deep learning tool that reduces High-Efficiency Video Coding (HEVC) complexity, evaluated in terms of encoding time and RD performance. The main contribution consists of a structural combination of the CNN and LSTM networks. The former is proposed to predict Coding Unit (CU) splitting and to reduce the complexity of HEVC encoding

  • We develop an LSTM network to study the CU partition correlation at intercoding. This is because the deep CNN proposed in [9] does not explore the temporal information of the CU partition for each HEVC frame. Then, we combine the two networks into a CNN-LSTM learning scheme to predict the intercoding CU splitting, which reduces the computational complexity of HEVC [9]

Summary

Introduction

New-generation digital media technology is emerging and multimedia applications are developing rapidly, for example HD and UHD surveillance cameras in smart cities, alongside the fast growth of smart connected devices (IoT) that stream video in real time. This popularity has drawn attention from both industry and the academic community. The authors in [9] developed a machine learning tool to predict the CU mode partition, which provides a good tradeoff between encoding time and RD performance. None of these approaches models the temporal correlation between video frames at intercoding. In this light, this paper proposes a deep learning tool that reduces HEVC complexity, evaluated in terms of encoding time and RD performance. That is how the CNN-LSTM-based learning approach is proposed: it predicts the intercoding CU partition instead of running the classical RDO search.
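
To make the idea of replacing the RDO search with a prediction more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 64x64 CTU and an 8x8 minimum CU size, and it uses a hypothetical callable `predict_split` as a stand-in for the trained CNN-LSTM; the names `partition_ctu`, `toy_predictor`, and the 0.5 threshold are illustrative assumptions only.

```python
from typing import Callable, List, Tuple

CU = Tuple[int, int, int]  # (x, y, size) of a square coding unit, in luma samples


def partition_ctu(x: int, y: int, size: int,
                  predict_split: Callable[[int, int, int], float],
                  min_size: int = 8,
                  threshold: float = 0.5) -> List[CU]:
    """Recursively split a CTU into CUs using a predicted split probability.

    predict_split is a hypothetical stand-in for the trained CNN-LSTM: it
    returns the probability that the CU at (x, y, size) should be split.
    No rate-distortion cost is evaluated here.
    """
    # Stop at the smallest CU size, or when the predictor says "do not split".
    if size <= min_size or predict_split(x, y, size) < threshold:
        return [(x, y, size)]
    half = size // 2
    leaves: List[CU] = []
    # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(
                partition_ctu(x + dx, y + dy, half,
                              predict_split, min_size, threshold))
    return leaves


def toy_predictor(x: int, y: int, size: int) -> float:
    """Toy rule (not a trained model): split the 64x64 CTU, then only
    the top-left 32x32 CU."""
    if size == 64:
        return 0.9
    if size == 32 and (x, y) == (0, 0):
        return 0.8
    return 0.1


cus = partition_ctu(0, 0, 64, toy_predictor)
for cu in cus:
    print(cu)
```

With the toy rule, the CTU is divided into four 16x16 CUs in the top-left quadrant and three 32x32 CUs elsewhere. In an actual encoder the predictor would take pixel data and temporal state as input rather than only the CU position and size.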

Related Work Overview
Proposed Framework Based on Deep Learning
Experimental Results