Abstract
Quality enhancement of HEVC-compressed videos has attracted considerable attention in recent years. In this article, we propose a robust multi-frame guided attention network (MGANet) to reconstruct high-quality frames from HEVC-compressed videos. In our network, we first use an advanced motion flow algorithm to estimate the motion of the input frames and use this estimate to guide the warping of adjacent frames. After alignment, however, large residuals still appear around the edges of moving objects in the warped frames. We therefore design a temporal encoder based on a bi-directional convolutional long short-term memory (ConvLSTM) with a residual structure to further capture the variations between the current frame and its adjacent warped frames. Finally, we feed the extracted temporal information and a partitioned average image (PAI) to a multi-scale guided encoder-decoder subnet to reconstruct high-quality frames. Each PAI is generated from the transform unit (TU) partitioning map, which can be extracted directly from the coded bit-stream, thus enabling our network to focus on the TU boundaries while optimizing the global content. We present extensive experimental results to demonstrate the robustness of our method, especially for the high bit-rate coding case and large motion scenes. Owing to its lightweight design, the proposed MGANet also achieves a very competitive inference time.
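As an illustration of the PAI idea, the following is a minimal sketch that assumes each TU region is filled with the mean of its pixels, as the name suggests; the abstract does not spell out the exact rule. The function name `partitioned_average_image` and the `(top, left, height, width)` rectangle layout are hypothetical, and a real pipeline would read the TU map from the decoder rather than hard-coding it.

```python
from typing import List, Tuple

Frame = List[List[float]]
# Hypothetical TU description: (top, left, height, width) in pixels.
TU = Tuple[int, int, int, int]


def partitioned_average_image(frame: Frame, tus: List[TU]) -> Frame:
    """Fill each TU block with the mean of its pixels (assumed PAI rule)."""
    # Copy the frame so the input is untouched; pixels not covered by
    # any TU keep their original value.
    pai = [row[:] for row in frame]
    for top, left, height, width in tus:
        rows = range(top, top + height)
        cols = range(left, left + width)
        block = [frame[y][x] for y in rows for x in cols]
        mean = sum(block) / len(block)
        for y in rows:
            for x in cols:
                pai[y][x] = mean
    return pai


# Toy 4x4 luma frame split into four 2x2 TUs.
frame = [
    [1.0, 3.0, 10.0, 10.0],
    [5.0, 7.0, 10.0, 10.0],
    [2.0, 2.0, 4.0, 8.0],
    [2.0, 2.0, 0.0, 4.0],
]
tus = [(0, 0, 2, 2), (0, 2, 2, 2), (2, 0, 2, 2), (2, 2, 2, 2)]
pai = partitioned_average_image(frame, tus)
```

Because every block is flat inside, the only intensity steps in such an image lie on TU boundaries, which is one plausible reading of how the PAI steers the network toward those boundaries.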