Deep learning has enabled image style transfer to make great strides forward. However, unlike many other styles, the watercolor style remains significantly challenging to transfer to portraits. Pixel-correlation-based methods do not produce satisfactory watercolors, because portrait watercolors exhibit a sophisticated fusion of painting techniques in local areas, which makes it difficult for convolutional neural networks to handle fine-grained features accurately. Moreover, existing style transfer methods rely on fixed receptive fields and therefore cope poorly with inputs at multiple scales, which greatly impedes their performance. Although it is possible to build an image processing pipeline that mimics various watercolor effects, such algorithms are slow and fragile, especially for inputs of different scales. As a remedy, this paper proposes WCGAN, a generative adversarial network (GAN) architecture dedicated to the watercolorization of portraits. Specifically, a novel localized style loss suited to watercolorization is proposed to handle local details. To process portraits of different scales and improve robustness, a novel discriminator architecture with three parallel branches of varying receptive field sizes is introduced. In addition, WCGAN is extended to video style transfer, for which a new kind of video training data based on random crops is developed to efficiently capture temporal consistency. Extensive qualitative and quantitative experimental results demonstrate that WCGAN generates state-of-the-art, high-quality watercolors from portraits.
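To make the multi-branch discriminator idea concrete, the following is a minimal PyTorch sketch of a discriminator with three parallel convolutional branches whose different depths yield small, medium, and large receptive fields. The class name, branch depths, channel widths, and kernel sizes are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch: three parallel discriminator branches with increasing
# receptive fields. All hyperparameters below are assumptions for illustration.
import torch
import torch.nn as nn


def conv_branch(depth: int, base_channels: int = 64) -> nn.Sequential:
    """Stack `depth` strided 4x4 convolutions; deeper stacks see a larger receptive field."""
    layers, in_ch = [], 3
    for i in range(depth):
        out_ch = base_channels * (2 ** i)
        layers += [nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
                   nn.LeakyReLU(0.2, inplace=True)]
        in_ch = out_ch
    # Final 1-channel map gives a patch-level real/fake prediction.
    layers.append(nn.Conv2d(in_ch, 1, kernel_size=3, stride=1, padding=1))
    return nn.Sequential(*layers)


class MultiScaleDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        # Three parallel branches of increasing depth, hence increasing receptive field.
        self.branches = nn.ModuleList([conv_branch(d) for d in (2, 3, 4)])

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        # Each branch returns its own patch-wise prediction map; an adversarial
        # loss could then be averaged over the three maps.
        return [branch(x) for branch in self.branches]


if __name__ == "__main__":
    d = MultiScaleDiscriminator()
    preds = d(torch.randn(1, 3, 256, 256))
    print([p.shape for p in preds])  # one prediction map per receptive-field size
```

The design intuition is that the shallow branch judges fine, local watercolor textures while the deeper branches judge coarser structure, so a single discriminator can assess portraits rendered at different scales.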