Abstract

Creating a novel font set requires domain expertise and is a laborious, time-consuming process, particularly for languages with many characters and complicated structures. Existing deep-learning-based methods treat font generation (FG) as an image-to-image translation problem, mostly in a supervised setting that relies on either paired images or font labels (character or style labels), which require extensive effort and are expensive to collect. Moreover, these supervised counterparts do not generalize to other text-image-related tasks, such as word-image generation and font attribute control at inference time. We find that these drawbacks stem mainly from the supervised setting adopted by existing FG methods. In this paper, we tackle the FG problem in a truly unsupervised fashion: a complete font set is generated by training the generator so that adjacent styles are uncorrelated and by projecting the input glyph image into its corresponding font-style latent space. To this end, we propose the Font Mixing Generative Adversarial Network (FM-GAN), which employs mixing regularization to guide the generator to localize font styles, and a projection encoder that maps an arbitrary glyph image into a semantic space compatible with the generator. In experiments, we demonstrate that our unsupervised model synthesizes font images comparable to those of supervised state-of-the-art FG baselines. Furthermore, FM-GAN can be applied directly to other text-image-related tasks, such as multi-lingual font style transfer, word-image generation, and font attribute control.
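The mixing-regularization idea mentioned above can be sketched in miniature. The snippet below is a hypothetical illustration, not the paper's implementation: all names (`mix_styles`, `style_a`, `style_b`) are assumed. The core pattern, familiar from style-based generators, is to feed two different style codes to different layer ranges of the generator during training, so that no layer can rely on the style used by its neighbors.

```python
# Hypothetical sketch of style-mixing regularization; names are assumed,
# not taken from the FM-GAN paper.
import random


def mix_styles(style_a, style_b, num_layers, rng=random):
    """Return a per-layer style schedule for one training pass.

    Layers before a randomly chosen crossover point receive style_a;
    the remaining layers receive style_b, so adjacent layer ranges
    see decorrelated styles.
    """
    # Crossover in [1, num_layers - 1]: each style covers at least one layer.
    crossover = rng.randint(1, num_layers - 1)
    return [style_a if i < crossover else style_b for i in range(num_layers)]
```

In use, each entry of the returned schedule would condition one generator layer; sampling a fresh crossover point every iteration is what discourages the generator from entangling styles across layers.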
