Abstract

Interpreting style transfer methods and generating high-quality stylized images are two challenging computer vision tasks. However, most current image style transfer methods are not interpretable, and their image cartoonization performance is also unsatisfactory because of the complex lines and rich abstract features of cartoon style. To alleviate these two issues, in this paper we propose a novel two-stage interpretable learning method, the two-stage generative adversarial network (TSGAN), for image cartoonization. Specifically, we divide the generative model into a content learning stage and a stylization stage. The advantages are twofold: first, the finely separated two-stage image generation model is more interpretable and easier to understand; second, TSGAN can adjust the content and style details of the generated image. We further propose a Cartoon Image Enhance (CIE) module that dynamically samples salient cartoon texture details from the training data to generate cartoon images of higher quality. Experimental results show that TSGAN is effective compared with four representative methods in terms of visual, qualitative, and quantitative comparisons and a user study.
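To make the two-stage decomposition concrete, the sketch below shows one plausible way the generator described in the abstract could be organized: a content learning stage that extracts structural features from a photo, followed by a stylization stage that renders those features as a cartoon image. This is a minimal illustrative sketch only; the class names (`TwoStageGenerator`, `ContentStage`, `StylizationStage`) and all layer choices are assumptions, since the abstract does not specify the architecture.

```python
import torch
import torch.nn as nn

class ContentStage(nn.Module):
    """Stage 1 (assumed): learns content/structure features of the input photo."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class StylizationStage(nn.Module):
    """Stage 2 (assumed): renders cartoon style on top of the content features."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, kernel_size=3, padding=1),
            nn.Tanh(),  # output in [-1, 1], a common GAN generator convention
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features)

class TwoStageGenerator(nn.Module):
    """Hypothetical two-stage generator: content learning, then stylization."""
    def __init__(self):
        super().__init__()
        self.content = ContentStage()
        self.style = StylizationStage()

    def forward(self, photo: torch.Tensor) -> torch.Tensor:
        content_features = self.content(photo)  # stage 1: content learning
        cartoon = self.style(content_features)  # stage 2: stylization
        return cartoon

# Usage: map a 256x256 RGB photo to a cartoon image of the same size.
photo = torch.randn(1, 3, 256, 256)
cartoon = TwoStageGenerator()(photo)
print(cartoon.shape)  # torch.Size([1, 3, 256, 256])
```

Separating the two stages in this way is what would allow the content and style details of the output to be adjusted independently, as the abstract claims; the adversarial discriminators and the CIE sampling module are omitted here.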
