Abstract

3D face reconstruction has witnessed considerable progress in recovering 3D face shapes and textures from in-the-wild images. However, due to a lack of texture detail information, the reconstructed shape and texture based on deep learning could not be used to re-render a photorealistic facial image since it does not work in harmony with weak supervision only from the spatial domain. In the paper, we propose a method of spatio-frequency decoupled weak-supervision for face reconstruction, which applies the losses from not only the spatial domain but also the frequency domain to learn the reconstruction process that approaches photorealistic effect based on the output shape and texture. In detail, the spatial domain losses cover image-level and perceptual-level supervision. Moreover, the frequency domain information is separated from the input and rendered images, respectively, and is then used to build the frequency-based loss. In particular, we devise a spectrum-wise weighted Wing loss to implement balanced attention on different spectrums. Through the spatio-frequency decoupled weak-supervision, the reconstruction process can be learned in harmony and generate detailed texture and high-quality shape only with labels of landmarks. The experiments on several benchmarks show that our method can generate high-quality results and outperform state-of-the-art methods in qualitative and quantitative comparisons.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.