Abstract

3D face reconstruction from a single image is a vital task in many multimedia applications. A key challenge in 3D face shape reconstruction is building correct dense correspondence between the monocular input face and the deformable mesh. Most existing methods rely on shape labels fitted by traditional methods or on strong priors such as multi-view geometry consistency. In contrast, we propose an innovative 3D Modulated Morphable Model (3D3M) that learns dense shape correspondence from monocular images in a self-supervised manner. Specifically, given a batch of input faces, 3D3M encodes their 3DMM attributes (shape, texture, lighting, etc.) and then randomly shuffles these attributes across the batch to generate attribute-changed faces. The attribute-changed faces can be encoded and rendered back in a cycle-consistent manner, which allows us to exploit self-supervised consistencies in both the dense mesh vertices and the reconstructed pixels. This dense shape and pixel correspondence enables a series of self-supervised constraints that fit the 3D face model accurately and learn per-vertex correctives end-to-end. 3D3M produces high-quality 3D face reconstructions from monocular images. Both quantitative and qualitative experimental results verify the superiority of 3D3M over prior art on 3D face reconstruction and face alignment.
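The batch-wise attribute shuffle at the heart of the method can be illustrated with a minimal sketch. This is not the authors' implementation: the attribute names, dimensions, and the stand-in "recovery" step (an inverse permutation in place of the real encode-and-render cycle) are all assumptions for illustration only.

```python
import numpy as np

def shuffle_attribute(attrs, key, perm):
    """Swap one 3DMM attribute (e.g. 'shape') across the batch
    according to a permutation, leaving the others untouched."""
    out = {k: v.copy() for k, v in attrs.items()}
    out[key] = attrs[key][perm]
    return out

def cycle_consistency_loss(original, recovered):
    """Mean squared difference between original and cycle-recovered
    attributes, summed over all attribute groups."""
    return sum(float(np.mean((original[k] - recovered[k]) ** 2))
               for k in original)

rng = np.random.default_rng(0)
batch = 4
# Hypothetical attribute dimensions; the real 3DMM coefficient sizes differ.
attrs = {
    "shape":    rng.standard_normal((batch, 80)),
    "texture":  rng.standard_normal((batch, 80)),
    "lighting": rng.standard_normal((batch, 27)),
}

perm = rng.permutation(batch)
inv_perm = np.argsort(perm)  # inverse permutation

swapped = shuffle_attribute(attrs, "shape", perm)
# Un-shuffling stands in for "encode the attribute-changed face and
# render it back"; a perfect encoder would recover the originals,
# driving the cycle-consistency loss to zero.
recovered = shuffle_attribute(swapped, "shape", inv_perm)

loss = cycle_consistency_loss(attrs, recovered)
print(f"cycle loss: {loss:.6f}")  # 0 for a perfect round trip
```

In training, the recovery step would be the actual encoder applied to rendered attribute-changed faces, and the loss would be backpropagated through both encoding passes.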
