Abstract

In the context of convolutional neural network (CNN)-based video compression, and motivated by the lower acuity of the human visual system for color differences compared with luma, we investigate a video compression framework using autoencoder networks that encode and decode videos with less chroma information than luma information. For this purpose, instead of converting Y’CbCr 4:2:2/4:2:0 videos to and from RGB 4:4:4 as in the current state of the art, we keep the video in Y’CbCr 4:2:2/4:2:0 and merge the luma and chroma channels after the luma is downsampled to match the chroma size; the decoder performs the inverse operation. The performance of our models against the 4:4:4 baseline is evaluated using the color peak signal-to-noise ratio (CPSNR), multiscale structural similarity (MS-SSIM), and video multimethod assessment fusion (VMAF) metrics. Our experiments reveal that, compared to video compression involving conversion to and from RGB 4:4:4, the proposed method increases video quality by about 5% for Y’CbCr 4:2:2 and 6% for Y’CbCr 4:2:0, while reducing the amount of computation by nearly 37% for Y’CbCr 4:2:2 and 40% for Y’CbCr 4:2:0. These results suggest that the current state-of-the-art autoencoder can be optimized for 4:2:2 and 4:2:0 video.
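To make the merging step concrete, the following is a minimal sketch (not the authors' implementation) of how 4:2:0 luma and chroma planes could be combined into a single encoder input and split back at the decoder; the average pooling and bilinear upsampling choices, function names, and tensor shapes are illustrative assumptions.

```python
# Hypothetical sketch: merge Y'CbCr 4:2:0 planes into one 3-channel tensor by
# downsampling luma to the chroma resolution, and the inverse for the decoder.
import torch
import torch.nn.functional as F


def merge_420(y: torch.Tensor, cb: torch.Tensor, cr: torch.Tensor) -> torch.Tensor:
    """y: (N, 1, H, W); cb, cr: (N, 1, H/2, W/2) -> merged (N, 3, H/2, W/2)."""
    y_down = F.avg_pool2d(y, kernel_size=2)      # downsample luma to chroma size
    return torch.cat([y_down, cb, cr], dim=1)    # single input for the encoder


def split_420(x: torch.Tensor):
    """Inverse step after the decoder: (N, 3, H/2, W/2) -> Y'CbCr 4:2:0 planes."""
    y_down, cb, cr = x[:, 0:1], x[:, 1:2], x[:, 2:3]
    # restore luma to its original resolution (upsampling filter is an assumption)
    y = F.interpolate(y_down, scale_factor=2, mode="bilinear", align_corners=False)
    return y, cb, cr
```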
