Abstract

The huge data volume of 3D video poses great challenges to the storage and deployment of existing multimedia systems. To reduce the redundancy of 3D video, this paper proposes a global-context aggregated intra prediction network (GCAIP-Net) to improve the performance of depth video coding. In contrast to previous studies, which exploit only the local context of reference pixels, the proposed GCAIP-Net takes advantage of the transformer to explore the global context for more accurate prediction. Specifically, a transformer-based prediction module is designed to generate predicted features by modeling long-range dependencies in the spatial domain. The proposed method is integrated into the 3D-HEVC reference software HTM16.2, and extensive experiments show that it achieves a 3.62% BD-rate reduction compared with 3D-HEVC under the all-intra configuration.
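The core idea, predicting a block by attending over all reference pixels rather than only local neighbors, can be illustrated with a minimal sketch. This is not the paper's actual GCAIP-Net architecture; the function names, feature dimensions, and the single-head scaled dot-product attention used here are illustrative assumptions chosen to show how global context could be aggregated.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_context_predict(queries, ref_feats):
    """Illustrative global-context prediction via scaled dot-product attention.

    queries:   (N, d) embeddings of the N block positions to predict
    ref_feats: (M, d) embeddings of ALL reconstructed reference pixels
    Returns:   (N, d) predicted features, each a weighted sum over every
               reference pixel (long-range dependencies), not just the
               spatially nearest ones as in conventional intra prediction.
    """
    d = queries.shape[-1]
    scores = queries @ ref_feats.T / np.sqrt(d)   # (N, M) global affinities
    weights = softmax(scores, axis=-1)            # each row sums to 1
    return weights @ ref_feats

rng = np.random.default_rng(0)
pred = global_context_predict(rng.standard_normal((16, 8)),
                              rng.standard_normal((64, 8)))
print(pred.shape)
```

In a real coding-tool integration, the predicted features would be mapped back to pixel values and the network's prediction would compete with the standard 3D-HEVC intra modes in rate-distortion optimization.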
