With the development of machine learning, advanced photography and image transmission systems, images are being processed more and more by machines, so image coding for machines (ICM) came into being. After the image codec compresses and transmits the image, the image will be handed over to machine vision task networks. These vision tasks include image classification, semantic segmentation, and so on. We propose a side information-driven image coding for hybrid machine–human vision (SICMH) framework, not only for machine vision tasks, but also for human vision-oriented image reconstruction. The proposed SICMH framework can perform image classification, semantic segmentation, and coarse image reconstruction by using purely the side information. Moreover, SICMH can perform fine image reconstruction by using the residue information. In particular, we propose a multi-scale feature fusion block to enhance the usage of side information, and a novel semantic segmentation network named modified TrSeg to generate better semantic segmentation maps. The experimental results well demonstrated the effectiveness of our proposed framework. SICMH achieves the same image classification and semantic segmentation accuracy as the existing traditional or learning-based multi-task ICM frameworks using the lowest bitrate. For the image reconstruction task, the proposed SICMH achieved the same PSNR as existing learning-based multi-task hybrid ICM frameworks and the traditional image codec BPG again with the lowest bitrate.
Read full abstract