For Chinese font images, when all their strokes are replaced by pattern elements such as flowers and birds, they become flower–bird character paintings, which are traditional Chinese art treasures. The generation of flower–bird painting requires professional painters’ great efforts. How to automatically generate these paintings from font images? There is a huge gap between the font domain and the painting domain. Although many image-to-image translation frameworks have been proposed, they are unable to handle this situation effectively. In this study, a novel method called font-to-painting network (F2PNet) is proposed for font-to-painting translation. Specifically, an encoder equipped with dilated convolutions extracts features of the font image, and then the features are fed into the domain translation module for mapping the font feature space to the painting feature space. The acquired features are further adjusted by the refinement module and utilised by the decoder to obtain the target painting. The authors apply adversarial loss and cycle-consistency loss to F2PNet and further propose a loss term, which is called recognisability loss and makes the generated painting have font-level recognisability. It is proved by experiments that F2PNet is effective and can be used as an unsupervised image-to-image translation framework to solve more image translation tasks.