Abstract

Joint 3D face reconstruction and dense alignment from a single image is a long-standing challenge in computer vision. Recent work achieves both goals simultaneously by using a position map to represent a 3D face. In this paper, we design a novel upsampling block, called the shortcut-upsampling block, to construct the position map. We first use residual blocks to extract and downsample feature maps from the 2D face image, and then use the proposed shortcut-upsampling blocks to convert the feature maps into the corresponding position map. The shortcut-upsampling block keeps our model at only 121M while achieving excellent performance on 3D face reconstruction and dense alignment. Compared with other deep learning methods based on the 3D Morphable Model (3DMM), our model is the fastest at reconstructing a 3D face. In addition, by introducing a dynamic weight loss function, our model converges to a lower loss value and achieves better performance on 3D face reconstruction.

Highlights

  • Three-dimensional facial reconstruction and face alignment based on a single view have long been two active research topics in computer vision, and both are affected by many factors such as large poses, illumination, and occlusion

  • The 300W-LP dataset contains about 60k 3D face models, each of which corresponds to a 2D face image

  • Compared with the PRN we trained in the same experimental environment, our network achieves breakthroughs in both 3D face reconstruction and dense face alignment

Summary

INTRODUCTION

Three-dimensional facial reconstruction and face alignment based on a single view have long been two active research topics in computer vision, and both are affected by many factors such as large poses, illumination, and occlusion. Researchers working on face alignment usually train face models on datasets labeled in 2D space and use them to predict landmarks. Both the Position Map Regression Network (PRN) [3] and the Volumetric Regression Network (VRN) [4] are model-free methods: in essence, they directly regress each point on the 3D face. They do not discard the context information of the points in the 3D face and are not restricted by a model. The method in [19] is an outstanding machine learning algorithm that achieves excellent running time and landmark localization accuracy, but its robustness to face images is clearly insufficient compared with recent neural-network-based methods.
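
To make the position-map idea concrete, the following is a minimal sketch of how a position map can be read back as a 3D face and as a set of landmarks. It is not the authors' code. The 256x256 resolution, the square face mask, the landmark indices, and the function names `position_map_to_vertices` and `sample_landmarks` are all assumptions chosen for illustration, and the map itself is filled with random values rather than network output.

```python
import numpy as np


def position_map_to_vertices(pos_map, face_mask=None):
    """Flatten an (H, W, 3) position map into an (N, 3) array of 3D points.

    Each pixel stores the (x, y, z) coordinates of one face point, so
    recovering the point cloud is just a reshape. The optional boolean
    mask keeps only pixels that belong to the face region.
    """
    if pos_map.ndim != 3 or pos_map.shape[2] != 3:
        raise ValueError("pos_map must have shape (H, W, 3)")
    points = pos_map.reshape(-1, 3)
    if face_mask is not None:
        points = points[face_mask.reshape(-1)]
    return points


def sample_landmarks(pos_map, uv_indices):
    """Look up 3D landmarks at fixed (row, col) positions of the map."""
    rows, cols = uv_indices[:, 0], uv_indices[:, 1]
    return pos_map[rows, cols]


# Toy data standing in for a predicted position map (assumed 256x256).
rng = np.random.default_rng(0)
pos_map = rng.random((256, 256, 3)).astype(np.float32)

# Hypothetical face mask: a centered square instead of a real face region.
mask = np.zeros((256, 256), dtype=bool)
mask[64:192, 64:192] = True

# Hypothetical landmark locations in the map, as (row, col) pairs.
uv_idx = np.array([[100, 100], [128, 128], [150, 90]])

vertices = position_map_to_vertices(pos_map, mask)
landmarks = sample_landmarks(pos_map, uv_idx)
```

Because every pixel corresponds to a fixed point on the face surface, dense alignment and reconstruction share one output: the full map gives the 3D shape, and a fixed set of pixel indices gives the landmarks.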

SHORTCUT-UPSAMPLING BLOCK AND TRAINING NETWORK
DYNAMIC WEIGHT LOSS FUNCTION
TRAINING DETAILS AND TEST DATASETS
Findings
CONCLUSION