The efficiency of deep image compression has improved in recent years. However, to fully exploit context information when compressing image objects of different scales and shapes, the geometric structure of the inputs should be modeled more adaptively. In this paper, we introduce deformable convolution and its spatial attention extension into the deep image compression task. Specifically, we propose a deep image compression network with Multi-Scale Deformable Convolution and Spatial Attention, named MS-DCSA, to extract more compact and efficient latent representations and to reconstruct higher-quality images. First, a multi-scale deformable convolution is presented that provides multi-scale receptive fields for learning the spatial sampling offsets used in deformable operations. Next, a multi-scale deformable spatial attention module is developed to generate attention masks that re-weight the extracted features according to their importance. In addition, the multi-scale deformable convolution is used to design the up/down-sampling modules. Extensive experiments demonstrate that the proposed MS-DCSA network achieves improved performance on both the PSNR and MS-SSIM quality metrics compared with conventional methods as well as competing deep image compression methods.
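As a rough illustration of the two operations named in the abstract, the sketch below shows a single-channel 3x3 deformable convolution evaluated at one position, followed by a sigmoid attention mask that re-weights features. It is not the MS-DCSA network: the function names (`bilinear_sample`, `deformable_conv_at`, `reweight`) are hypothetical, and the offsets and mask logits are passed in by hand, whereas in a deformable network they would be predicted by learned branches (multi-scale ones, according to the abstract).

```python
import math


def bilinear_sample(feat, y, x):
    """Sample a 2D grid at a fractional (y, x); positions outside the grid read as 0."""
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                # Bilinear weight falls off linearly with distance to the grid point.
                weight = (1 - abs(y - yi)) * (1 - abs(x - xi))
                value += weight * feat[yi][xi]
    return value


def deformable_conv_at(feat, kernel, offsets, cy, cx):
    """3x3 deformable convolution output at (cy, cx).

    kernel  : 9 weights in row-major order (assumed layout).
    offsets : 9 (dy, dx) pairs, one per kernel tap; each shifts the regular
              grid position of that tap. Supplied by hand in this sketch.
    """
    out = 0.0
    k = 0
    for i in (-1, 0, 1):
        for j in (-1, 0, 1):
            dy, dx = offsets[k]
            out += kernel[k] * bilinear_sample(feat, cy + i + dy, cx + j + dx)
            k += 1
    return out


def reweight(feat, logits):
    """Spatial attention: scale each position by sigmoid(logit) in (0, 1)."""
    return [
        [f / (1.0 + math.exp(-l)) for f, l in zip(frow, lrow)]
        for frow, lrow in zip(feat, logits)
    ]


feat = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
kernel = [1.0 / 9.0] * 9
zero_offsets = [(0.0, 0.0)] * 9
# With all offsets at zero, this reduces to an ordinary 3x3 convolution.
center = deformable_conv_at(feat, kernel, zero_offsets, 1, 1)
```

With zero offsets the sampling grid is the regular 3x3 neighborhood; non-zero offsets move each tap to a fractional position, which is what lets the receptive field adapt to object shape.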