Abstract
Many facial landmark methods based on convolutional neural networks (CNNs) have been proposed and achieve favorable detection results. However, landmarks detected in video frames are often unstable because CNNs are extremely sensitive to input image noise. To solve this problem of landmark shaking, this study proposes a simple and effective facial landmark detection method comprising a lightweight U-Net model and a dynamic optical flow (DOF). The DOF uses fast optical flow to obtain the optical flow vector of each landmark and uses dynamic routing to improve landmark stabilization. A lightweight U-Net model is designed to predict facial landmarks with a smaller model size and lower computational complexity. The predicted facial landmarks are then fed to the DOF to suppress the unstable shaking. Finally, the proposed detection method is compared with several common methods on several benchmark datasets. Experimental evaluations and analyses show that the lightweight U-Net model achieves favorable landmark prediction and that the DOF stabilizing method improves the robustness of landmark prediction in both static images and video frames. Notably, the proposed detection system performs better than the compared methods without requiring a heavy computational load.
Highlights
Over the past decade, several facial image applications have been widely developed such as facial recognition, facial enhancement, head pose estimation, facial expression recognition, face swapping, and face monitoring [1,2,3]
A local neural field patch expert [5], which can learn the similarity of surrounding pixels and the sparsity constraints of pixels, was proposed to solve the problem of matching failure
This study proposes a facial landmark detection method, which comprises a lightweight U-Net model and a dynamic optical flow (DOF)
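The idea of stabilizing per-frame detections with optical flow can be illustrated with a minimal sketch. This is not the paper's DOF or its dynamic routing, whose details are not given here. It assumes the flow vector at each landmark has already been computed by some optical flow method, and it uses a simple distance-based blend as a stand-in. The function name `stabilize_landmarks` and the parameter `scale` are hypothetical.

```python
import numpy as np

def stabilize_landmarks(prev_stable, flow_vectors, detected, scale=2.0):
    """Blend flow-propagated landmarks with the current detection.

    prev_stable:  (N, 2) stabilized landmarks from the previous frame
    flow_vectors: (N, 2) optical flow displacement at each landmark (assumed given)
    detected:     (N, 2) landmarks predicted by the detector on the current frame
    scale:        hypothetical distance in pixels that controls the blend
    """
    prev_stable = np.asarray(prev_stable, dtype=float)
    flow_vectors = np.asarray(flow_vectors, dtype=float)
    detected = np.asarray(detected, dtype=float)

    # Propagate the previous landmarks along their flow vectors.
    tracked = prev_stable + flow_vectors

    # Per-landmark disagreement between tracking and detection.
    gap = np.linalg.norm(detected - tracked, axis=1, keepdims=True)

    # Small gaps are treated as detector jitter, so the tracked position
    # dominates; large gaps are treated as real motion or tracking drift,
    # so the detection dominates. This weighting is an assumption.
    w = gap / (gap + scale)
    return w * detected + (1.0 - w) * tracked
```

With this weighting, a landmark whose detection agrees with the flow-propagated position is left unchanged, while a detection that is `scale` pixels away from the propagated position is averaged equally with it.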
Summary
Several facial image applications have been widely developed, such as facial recognition, facial enhancement, head pose estimation, facial expression recognition, face swapping, and face monitoring [1,2,3]. Facial landmark detection is a very important research topic in these applications, but it poses many challenges, such as blurred images, extreme lighting, artificial occlusion, extreme head posture, and data imbalance. To tackle these problems, active shape models have become one of the most popular traditional facial landmark methods [4]. A convolutional expert network [6], which combined the advantages of neural architectures and mixtures of experts in an end-to-end framework, was proposed to improve the robustness of landmark prediction. It required time-consuming training to build a model for the appearance of each facial landmark, and poor initial sample selection led to poor learning results.