Abstract

Learning to synthesize free-hand sketches controllably, according to specified categories and sketching styles, is challenging because training data with category and style labels are lacking. One way to control the synthesis is to self-organize a latent coding space so that it preserves the similarity of structural patterns in the observed data. A practical approach is to introduce a Gaussian mixture prior over the latent codes, where each Gaussian component represents a specific categorical or stylistic pattern. We can then generate sketches by sampling latent variables from the Gaussian components, or by continuously manipulating the latent representations through interpolation. For robust controllable sketch synthesis, it is critical to choose an appropriate number of Gaussian components. An underestimated number cannot represent all the sketch patterns, so some clusters must contain sketches with more than one pattern. An overestimated number introduces redundant components, which usually represent a chaotic collection of sketches whose diverse patterns are already featured by other components. Both cases disturb pattern clustering over the coding space and make the internal code generation difficult to control for specific patterns. However, the true number of components is unavailable in this unsupervised task. In this paper, we present Rival Penalized Competitive Learning pixel to sequence (RPCL-pix2seq), which determines the number of Gaussian components automatically. Both quantitative and qualitative experimental results show that RPCL-pix2seq can partition the sketch codes into an approximately stable number of clusters. Hence, we are able to perform synthesis reasoning over the latent space, generating novel but reasonable sketches that neither appear in the training dataset nor exist in real life.
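The two control mechanisms mentioned above, sampling a latent code from one Gaussian component and interpolating between latent codes, can be illustrated with a minimal NumPy sketch. This is not the RPCL-pix2seq implementation: the function names, the diagonal-covariance assumption, the component count, and the latent dimension are all hypothetical, and the decoder that would turn a latent code into a sketch is omitted.

```python
import numpy as np

def sample_from_component(means, stds, k, n, rng):
    """Draw n latent codes from the k-th component (diagonal covariance assumed)."""
    noise = rng.standard_normal((n, means.shape[1]))
    return means[k] + stds[k] * noise

def interpolate(z_a, z_b, steps):
    """Linearly interpolate between two latent codes, endpoints included."""
    t = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - t) * z_a + t * z_b

rng = np.random.default_rng(0)

# Hypothetical mixture: 3 components in a 4-dimensional latent space.
num_components, latent_dim = 3, 4
means = rng.normal(size=(num_components, latent_dim))
stds = np.full((num_components, latent_dim), 0.1)

# Codes drawn around one component would be decoded into sketches
# sharing that component's categorical or stylistic pattern.
codes = sample_from_component(means, stds, k=1, n=5, rng=rng)

# A path between two component means; decoding each point would give
# a gradual transition between the two patterns.
path = interpolate(means[0], means[2], steps=7)
```

In a full model, each row of `codes` or `path` would be passed to a trained decoder; how well the resulting sketches separate by pattern depends on the number of components matching the number of patterns in the data, which is the quantity the paper sets out to determine automatically.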
