Towards Real-Time Facial Landmark Detection in Depth Data Using Auxiliary Information

Connah Kendrick, Kevin Tan, Moi Hoon Yap, Kevin Walker

doi:10.3390/sym10060230


Open Access

https://doi.org/10.3390/sym10060230


Abstract

Modern facial motion capture systems employ a two-pronged approach for capturing and rendering facial motion. Visual data (2D) is used for tracking the facial features and predicting facial expression, whereas Depth (3D) data is used to build a series of expressions on 3D face models. An issue with modern research approaches is the use of a single data stream that provides little indication of the 3D facial structure. We compare and analyse the performance of Convolutional Neural Networks (CNN) using visual, Depth and merged data to identify facial features in real-time using a Depth sensor. First, we review the facial landmarking algorithms and their datasets for Depth data. We address the limitation of the current datasets by introducing the Kinect One Expression Dataset (KOED). Then, we propose the use of CNNs for the single data stream and merged data streams for facial landmark detection. We contribute to existing work by performing a full evaluation of which streams are the most effective for the field of facial landmarking. Furthermore, we improve upon the existing work by extending neural networks to predict 3D landmarks in real-time, with additional observations on the impact of using 2D landmarks as auxiliary information. We evaluate the performance by using Mean Square Error (MSE) and Mean Average Error (MAE). We observe that the single data stream predicts accurate facial landmarks on Depth data when auxiliary information is used to train the network. The code and dataset used in this paper will be made available.

Highlights

Motion capture using visual cameras is a common practice in high-end facial animation production
We evaluate the performance by using Mean Square Error (MSE) and Mean Average Error (MAE)
We examine the results of the testing set with both MSE and MAE scores
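
The two error measures named above can be illustrated with a minimal sketch. It assumes that landmarks are given as lists of (x, y) or (x, y, z) coordinates and that MAE is computed as the mean of the absolute coordinate differences; the function names are hypothetical and are not taken from the paper's code.

```python
from typing import List, Sequence

Landmark = Sequence[float]  # one landmark: (x, y) or (x, y, z)


def _flatten(landmarks: Sequence[Landmark]) -> List[float]:
    """Turn a list of landmarks into one flat list of coordinates."""
    return [coord for point in landmarks for coord in point]


def _paired(predicted: Sequence[Landmark], target: Sequence[Landmark]):
    """Flatten both inputs and check that they line up coordinate by coordinate."""
    p, t = _flatten(predicted), _flatten(target)
    if not p or len(p) != len(t):
        raise ValueError("predicted and target must be non-empty and equally sized")
    return list(zip(p, t))


def mean_square_error(predicted: Sequence[Landmark], target: Sequence[Landmark]) -> float:
    """Average of squared coordinate differences (penalises large misses more)."""
    pairs = _paired(predicted, target)
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)


def mean_absolute_error(predicted: Sequence[Landmark], target: Sequence[Landmark]) -> float:
    """Average of absolute coordinate differences (assumed reading of MAE)."""
    pairs = _paired(predicted, target)
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Two made-up 3D landmarks, purely for illustration.
    predicted = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
    target = [(1.0, 2.0, 5.0), (4.0, 3.0, 6.0)]
    print(mean_square_error(predicted, target))
    print(mean_absolute_error(predicted, target))
```

Reporting both measures is useful because MSE is dominated by a few badly placed landmarks, while the absolute-error measure reflects the typical per-coordinate offset.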

Summary

Introduction

Motion capture using visual cameras is common practice in high-end facial animation production. With optical markers, adding multiple cameras allows Depth information to be captured. With the availability of RGB with Depth (RGBD) sensors, accuracy can potentially be increased by merging the data streams within a neural network. Merging RGB and Depth allows a marker-less system to predict Depth with high accuracy without requiring multiple cameras. We conduct a complete investigation of the effect of different data streams, such as greyscale (Gs), RGB, greyscale with Depth (GsD) and RGBD, on 2D and 3D facial landmark detection. This investigation lets us determine the best solution for automated real-time 3D landmark detection.
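
As a rough illustration of what the four input streams look like as network input, the sketch below builds per-pixel channel lists from an aligned colour frame and Depth frame. It is not the authors' pipeline: it assumes Gs denotes greyscale, that greyscale is obtained with the common BT.601 luma weights, and that the Depth frame is already registered to the colour frame. The function name `build_stream` is hypothetical.

```python
from typing import List, Sequence, Tuple

RGBPixel = Tuple[float, float, float]

# Assumption: BT.601 luma weights for the greyscale conversion.
LUMA = (0.299, 0.587, 0.114)
MODES = ("Gs", "RGB", "GsD", "RGBD")


def build_stream(
    rgb_frame: Sequence[Sequence[RGBPixel]],
    depth_frame: Sequence[Sequence[float]],
    mode: str,
) -> List[List[List[float]]]:
    """Return a height x width x channels nested list for the requested stream.

    Gs -> 1 channel, RGB -> 3, GsD -> 2, RGBD -> 4.
    """
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    same_rows = len(rgb_frame) == len(depth_frame)
    if not same_rows or any(len(r) != len(d) for r, d in zip(rgb_frame, depth_frame)):
        raise ValueError("colour and Depth frames must have the same size")

    merged = []
    for rgb_row, depth_row in zip(rgb_frame, depth_row_iter(depth_frame)):
        row = []
        for (r, g, b), d in zip(rgb_row, depth_row):
            if mode.startswith("Gs"):
                # Collapse colour into a single intensity channel.
                channels = [LUMA[0] * r + LUMA[1] * g + LUMA[2] * b]
            else:
                channels = [r, g, b]
            if mode.endswith("D"):
                # Append Depth as an extra channel (the "merged" streams).
                channels.append(d)
            row.append(channels)
        merged.append(row)
    return merged


def depth_row_iter(depth_frame: Sequence[Sequence[float]]):
    """Yield Depth rows; kept separate so a rescaling step could be added here."""
    for row in depth_frame:
        yield row


if __name__ == "__main__":
    # A made-up 1 x 2 frame, purely for illustration.
    rgb = [[(255, 255, 255), (0, 0, 0)]]
    depth = [[0.5, 1.0]]
    for mode in MODES:
        print(mode, len(build_stream(rgb, depth, mode)[0][0]), "channel(s)")
```

In practice the colour and Depth values would usually be scaled to a common range before being stacked, so that neither stream dominates the first convolutional layer; that step is omitted here.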

Related Work

Facial Landmarking with Neural Networks

Merging Visual and Depth

Existing Datasets

Experimental Protocol

Equipment and Experimental Set up

Camera

Lighting

Frame Rate and Storage

Methodology

Results

Discussion and Conclusions

Materials and Methods


Journal: Symmetry
Publication Date: Jun 17, 2018
Citations: 8
License type: CC BY 4.0
