Abstract

While existing prediction models built on popular deep architectures have shown promising results in facial depression recognition, they still lack sufficient discriminative power due to 1) the limited amount of labeled depression data available for deep representation learning and 2) the large variation in facial expression across different persons with the same depression score, coupled with the subtle differences in facial expression across different depression levels. In this article, we formulate facial depression recognition as a label distribution learning (LDL) problem and propose a deep joint label distribution and metric learning (DJ-LDML) method to address these issues. In DJ-LDML, LDL exploits the label relevance inherent in depression data to implicitly increase the amount of training data associated with each depression level without actually enlarging the dataset, while deep metric learning (DML) learns a deep ordinal embedding with a specifically designed label-aware histogram loss, allowing semantic similarity between video sequences (described by ordinal labels) to be preserved for discriminative feature learning. The two learning modules in DJ-LDML work collaboratively to enhance the representation ability and discriminative power of the deeply learned spatiotemporal feature, leading to improved depression prediction. We empirically evaluate our method on two benchmark datasets, and the results demonstrate the effectiveness of our formulation.
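To make the LDL idea concrete, the sketch below shows one common way an ordinal score can be converted into a label distribution: a discrete Gaussian centered on the ground-truth level, so that neighboring levels also receive non-zero label "degrees" and each sample effectively supervises several adjacent depression levels. This is a minimal illustration of the general LDL construction, not the paper's exact formulation; the level range and the bandwidth `sigma` are assumptions for illustration.

```python
import numpy as np

def label_distribution(score, num_levels=64, sigma=2.0):
    """Map a single ordinal depression score to a discrete label
    distribution: Gaussian weights centered on the true level,
    normalized to sum to 1 over all levels. Adjacent levels get
    small but non-zero probability mass, which is how LDL lets one
    labeled sample contribute to several nearby depression levels.
    """
    levels = np.arange(num_levels)
    weights = np.exp(-((levels - score) ** 2) / (2.0 * sigma ** 2))
    return weights / weights.sum()

# Example: a sample with ground-truth score 30.
dist = label_distribution(30)
```

The resulting vector peaks at the ground-truth score and decays smoothly over neighboring levels; a model can then be trained to match this distribution (e.g., with a KL-divergence loss) rather than a one-hot target.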
