Abstract

Multi-view generation from a given single view is a significant, yet challenging problem with broad applications in the field of virtual reality and robotics. Existing methods mainly utilize the basic GAN-based structure to help directly learn a mapping between two different views. Although they can produce plausible results, they still struggle to recover faithful details and fail to generalize to unseen data. In this paper, we propose to learn invariant and uniformly distributed representations for multi-view generation with an “Alignment” and a “Uniformity” constraint (AU-GAN). Our method is inspired by the idea of contrastive learning to learn a well-regulated feature space for multi-view generation. Specifically, our feature extractor is supposed to extract view-invariant representation that captures intrinsic and essential knowledge of the input, and distribute all representations evenly throughout the space to enable the network to “explore” the entire feature space, thus avoiding poor generative ability on unseen data. Extensive experiments on multi-view generation for both faces and objects demonstrate the generative capability of our proposed method on generating realistic and high-quality views, especially for unseen data in wild conditions.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call