Abstract

Although deep convolutional neural networks (CNNs) have achieved state-of-the-art results in facial expression recognition (FER), FER remains challenging for two reasons: class imbalance and hard expression examples. Most existing FER methods train a CNN with cross-entropy (CE) loss in a single stage, which has limited capability to deal with these problems because every expression example is assigned an equal loss weight. The recently proposed focal loss reduces the relative loss for well-classified examples and pays more attention to misclassified ones; introducing it into an existing FER system can therefore mitigate these problems when the data are imbalanced or contain hard expression examples. Considering that the focal loss allows the network to further extract discriminative features based on the feature-separating capability it has already learned, we present a two-stage training strategy that uses CE loss in the first stage and focal loss in the second stage to boost FER performance. Extensive experiments were conducted on two well-known FER datasets, CK+ and Oulu-CASIA. Our strategy improves on the common one-stage training strategy and achieves state-of-the-art results on both datasets in terms of average classification accuracy, which demonstrates the effectiveness of the proposed two-stage training strategy.
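
To illustrate the contrast between the two losses, here is a minimal Python sketch for a single example. It assumes the standard focal loss form FL(p_t) = -(1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The function names, the value gamma = 2.0, and the `stage_loss` helper are assumptions made for illustration; the abstract does not specify them, and the actual implementation may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """CE loss for one example: every example is weighted equally."""
    p_t = softmax(logits)[target]
    return -math.log(p_t)


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example.

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    examples (p_t close to 1). gamma=2.0 is an assumed value, not one
    taken from the abstract; gamma=0 recovers plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def stage_loss(stage):
    """Hypothetical helper: CE in stage 1, focal loss in stage 2."""
    return cross_entropy if stage == 1 else focal_loss
```

With gamma set to 0 the modulating factor equals 1, so the focal loss coincides with CE. For larger gamma, the loss of a confidently correct example shrinks much faster than that of a misclassified one, which is the reweighting the second training stage relies on.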
