Abstract

The success of deep learning has driven rapid progress in video quality enhancement. However, existing methods mainly focus on enhancing the objective quality of compressed video and ignore its perceptual quality, which plays a key role in determining the quality of experience (QoE) of videos. In this paper, we aim to enhance the perceptual quality of compressed video. Our main observation is that perceptual quality enhancement mostly relies on recovering high-frequency details with fine textures. Accordingly, we propose a novel generative adversarial network (GAN) based on the multi-level wavelet packet transform (WPT), called multi-level wavelet-based GAN+ (MW-GAN+), to exploit high-frequency details for enhancing the perceptual quality of compressed video. In MW-GAN+, we first propose a multi-level wavelet pixel-adaptive (MWP) module to extract temporal information across video frames, so that frame similarity can be utilized in recovering high-frequency details. Then, a wavelet reconstruction network, consisting of wavelet-dense residual blocks (WDRB), is developed to recover high-frequency details in a multi-level manner for enhanced frame reconstruction. Finally, we develop a 3D discriminator with a 3D-CNN-based architecture to encourage temporal coherence. Experimental results demonstrate the superiority of our method over state-of-the-art methods in enhancing the perceptual quality of compressed video. Our code is available at https://github.com/IceClear/MW-GAN.
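To illustrate why a multi-level WPT exposes high-frequency details, the following minimal sketch decomposes a single frame into subbands. It is not the MW-GAN+ implementation: the Haar filter, the plain-Python list representation, and the names `haar_step`, `wavelet_packet`, and `energy` are all assumptions made for illustration. Unlike a standard wavelet transform, which only re-decomposes the low-frequency band, a wavelet packet transform re-decomposes every subband, so `levels` levels yield `4 ** levels` subbands.

```python
# Illustrative sketch only: a Haar-based 2D wavelet packet transform.
# The Haar filter and all names here are assumptions, not taken from MW-GAN+.

def haar_step(frame):
    """Split a 2D frame (even height and width) into four half-size subbands.

    Returns (approx, horiz_diff, vert_diff, diag_diff): the local average and
    the differences across columns, across rows, and along the diagonal.
    """
    approx, horiz_diff, vert_diff, diag_diff = [], [], [], []
    for i in range(0, len(frame), 2):
        rows = ([], [], [], [])
        for j in range(0, len(frame[0]), 2):
            a, b = frame[i][j], frame[i][j + 1]
            c, d = frame[i + 1][j], frame[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2.0)  # low-frequency content
            rows[1].append((a - b + c - d) / 2.0)  # change across columns
            rows[2].append((a + b - c - d) / 2.0)  # change across rows
            rows[3].append((a - b - c + d) / 2.0)  # diagonal change
        approx.append(rows[0])
        horiz_diff.append(rows[1])
        vert_diff.append(rows[2])
        diag_diff.append(rows[3])
    return approx, horiz_diff, vert_diff, diag_diff


def wavelet_packet(frame, levels):
    """Apply haar_step recursively to every subband (packet decomposition)."""
    bands = [frame]
    for _ in range(levels):
        bands = [sub for band in bands for sub in haar_step(band)]
    return bands


def energy(band):
    """Sum of squared coefficients in a subband."""
    return sum(v * v for row in band for v in row)


# A 4x4 checkerboard is pure fine texture: its energy leaves the approximation band.
checker = [[1.0 if (i + j) % 2 == 0 else -1.0 for j in range(4)] for i in range(4)]
bands = wavelet_packet(checker, levels=2)
print(len(bands))         # 16 subbands for two levels
print(energy(bands[0]))   # 0.0: no energy in the lowest-frequency band
```

Because the Haar step is orthonormal, the total energy across all subbands equals the energy of the input frame, so the decomposition only redistributes information between low- and high-frequency bands. A practical pipeline would more likely rely on an optimized library or a learned, GPU-friendly implementation.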
