Image colorization has a wide range of applications, but it remains challenging because it is an inherently ill-posed problem with multi-modal uncertainty. Advances in deep learning have opened many avenues for addressing image colorization. However, current methods mainly suffer from two problems: inaccurate colorization that biases color tones (e.g., toward cool or warm hues) and undersaturated images. Existing Transformer-based methods can produce impressive results, but they often incur high training costs and may cause color overflow. In this paper, we propose a two-stage image colorization strategy based on a color codebook. We build the codebook by clustering in the three-dimensional CIE Lab color space, which incorporates brightness information so that the colors in the codebook are lifelike. In the first stage, we treat colorization as a classification problem over the color codebook; a high-quality codebook improves color classification accuracy. In the second stage, unlike traditional Transformer-based methods, we use a pyramid-type Transformer structure to extract rich image features and refine the colors, which can resolve potential color banding, color errors, and color overflow. In addition, the parameter count and FLOPs are significantly smaller than those of traditional Transformer-based methods. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches. On the ImageNet validation set, it achieves 4.60, 25.23, 0.19, and 39.82 in terms of FID, PSNR, LPIPS, and CF, respectively. On the COCO-Stuff validation set, it achieves 5.62, 25.15, 0.19, and 36.25 in terms of FID, PSNR, LPIPS, and CF, respectively. The code is available at https://github.com/Tanghui2000/Two-stage_Image_Colorization_via_Color_Codebook.
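The codebook idea can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the codebook size, so plain k-means is assumed here, and the function names `build_codebook` and `assign_codes` are hypothetical rather than taken from the released code. The sketch clusters Lab pixels into a small codebook and then maps each pixel to the index of its nearest entry, which is the kind of class label a first-stage classifier could be trained to predict.

```python
import numpy as np


def assign_codes(lab_pixels, codebook):
    """Return, for each (L, a, b) pixel, the index of the nearest codebook color."""
    # Squared Euclidean distance between every pixel and every codebook entry.
    diffs = lab_pixels[:, None, :] - codebook[None, :, :]
    return (diffs ** 2).sum(axis=2).argmin(axis=1)


def build_codebook(lab_pixels, k, iters=20, seed=0):
    """Assumed approach: plain k-means over (N, 3) Lab pixels, giving (k, 3) centroids."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct pixels.
    start = rng.choice(len(lab_pixels), size=k, replace=False)
    centers = lab_pixels[start].astype(np.float64)
    for _ in range(iters):
        labels = assign_codes(lab_pixels, centers)
        for j in range(k):
            members = lab_pixels[labels == j]
            if len(members):  # leave a centroid unchanged if its cluster is empty
                centers[j] = members.mean(axis=0)
    return centers
```

Because the lightness channel L takes part in the distance, two colors with similar chroma but different brightness can fall into different codebook entries, which matches the abstract's motivation for clustering in all three Lab dimensions.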