Abstract

Most compression algorithms aim to minimize one or another type of visual redundancy in an image. Compression becomes more challenging when contextual information must also be preserved. Without accounting for contextual information, learning-based methods also learn unwanted features, wasting computational resources. Motivated by this, we propose an attention-guided, multi-size-kernel convolutional network for image compression and decompression that focuses on the important local and global features needed for better reconstruction. Among the feature maps produced by convolution at any stage, channel attention identifies “what” is meaningful, while spatial attention identifies “where” the important features lie within the feature map. Second, we propose a perceptual loss function for the task of image compression, which combines contextual, style, and ℓ2 losses. Together, the proposed network and perceptual-loss training achieve significant improvements on datasets including CLIC 2019, Tecnick, Kodak, FDDB, ECSSD, and HKU-IS. On the challenging CLIC 2019 dataset at low bit rates (around 0.1 bpp), the proposed algorithm outperforms JPEG, JPEG2000, and BPG by up to approximately 49.6%, 34.61%, and 20.69% in MS-SSIM, and by up to 10.79%, 1.32%, and 3.36% in PSNR, respectively. We further evaluate the proposed algorithm on cartoon images and find it superior to other algorithms. Lastly, since cartoon images are scarce for deep-learning experimentation, we introduce a cartoon image dataset, namely CARTAGE.
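To make the channel-attention ("what") and spatial-attention ("where") idea concrete, here is a minimal, dependency-free sketch of the two gating steps on a toy feature map. This is an illustration of the general mechanism only, not the paper's architecture: the gating functions, pooling choices, and the toy feature map are all assumptions for demonstration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap):
    # fmap: list of C channels, each an HxW grid of floats.
    # Gate each channel by a sigmoid of its global average, so channels
    # with more meaningful activations are emphasized ("what").
    gates = [sigmoid(sum(sum(row) for row in ch) / (len(ch) * len(ch[0])))
             for ch in fmap]
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(fmap, gates)]

def spatial_attention(fmap):
    # Gate each spatial location by a sigmoid of the cross-channel mean,
    # emphasizing positions where important features occur ("where").
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    gate = [[sigmoid(sum(fmap[c][i][j] for c in range(C)) / C)
             for j in range(W)] for i in range(H)]
    return [[[fmap[c][i][j] * gate[i][j] for j in range(W)]
             for i in range(H)] for c in range(C)]

# Toy 2-channel, 2x2 feature map: one active channel, one silent channel.
fmap = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [0.0, 0.0]]]
out = spatial_attention(channel_attention(fmap))
```

In a real network the gates would be learned (e.g., small MLPs or convolutions over pooled statistics), but the composition is the same: channel gating first reweights whole feature maps, then spatial gating reweights individual positions.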
