The burgeoning popularity of cloud gaming makes efficient video compression critical for relieving the growing bandwidth pressure. While existing neural video coding approaches have demonstrated strong compression potential on natural videos, efficient neural codecs dedicated to gaming videos are still absent. To bridge this gap, in this paper, we propose an end-to-end neural video compression method designed specifically for cloud gaming videos. By exploiting the camera motion information that is readily available in cloud gaming, a learning-based module trained with multiple losses aligns the previously reconstructed frame to the current frame; the aligned frame then replaces the previously reconstructed frame as the reference for optical flow estimation. Reducing the camera-induced displacement between consecutive frames improves motion estimation accuracy, effectively handling the large and abrupt motion frequently present in gaming videos. Furthermore, the aligned tensor obtained in this step is used to enhance the latent prior of the entropy model, providing a superior temporal prior for coding. Extensive experimental results demonstrate the superior performance of our method over the previous state-of-the-art approach DCVC-HEM, marking significant progress in end-to-end neural compression for cloud gaming videos.
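The alignment idea can be illustrated with a minimal sketch (not the authors' code): assuming the game engine exposes a camera-induced displacement field, a small hypothetical network refines it into a dense flow, which is used to backward-warp the previously reconstructed frame toward the current frame before optical flow estimation. All names (CameraAlignNet, warp) and tensor shapes here are illustrative assumptions.

```python
# Minimal sketch of camera-motion alignment before flow estimation.
# CameraAlignNet, warp, and all shapes are hypothetical placeholders.
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """Backward-warp `frame` (B, C, H, W) by a dense flow field (B, 2, H, W)."""
    b, _, h, w = frame.shape
    # Build a normalized sampling grid in [-1, 1] for grid_sample.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=frame.dtype, device=frame.device),
        torch.arange(w, dtype=frame.dtype, device=frame.device),
        indexing="ij",
    )
    grid_x = (xs + flow[:, 0]) / (w - 1) * 2 - 1
    grid_y = (ys + flow[:, 1]) / (h - 1) * 2 - 1
    grid = torch.stack((grid_x, grid_y), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(frame, grid, align_corners=True)

class CameraAlignNet(torch.nn.Module):
    """Hypothetical learned module: refines the coarse camera-induced flow
    (e.g., derived from engine camera parameters) into a dense field."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv2d(2, 32, 3, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(32, 2, 3, padding=1),
        )

    def forward(self, camera_flow):
        # Predict a residual correction on top of the camera-induced flow.
        return camera_flow + self.net(camera_flow)

# Usage: align x_hat_prev toward the current frame; the residual motion
# left for the optical flow network to estimate is now much smaller.
align_net = CameraAlignNet()
camera_flow = torch.zeros(1, 2, 64, 64)  # camera-induced flow (from engine)
x_hat_prev = torch.rand(1, 3, 64, 64)    # previous reconstructed frame
aligned_prev = warp(x_hat_prev, align_net(camera_flow))
# aligned_prev replaces x_hat_prev as the reference for flow estimation.
```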