Video facial expression recognition (FER) has attracted considerable attention recently and supports a range of applications. Although many algorithms achieve impressive performance in controlled, occlusion-free environments, recognition under partial facial occlusion remains a challenging problem. Solutions based on reconstructing the occluded region of the face have been proposed to handle occlusions; these mostly rely on the face's shape or texture. Nonetheless, the similarity of facial expressions across individuals appears to be a valuable cue for reconstruction. In the first stage, Reinforcement Learning (RL) is introduced for occlusion-aware semantic segmentation: from a pool of unlabeled data, an agent learns a policy to select a subset of small, informative image patches to be labeled instead of full images. In the second stage, a trained Backtracking Search Algorithm (BSA) is used to rebuild optical flows that have been distorted by the occlusion. Given optical flows estimated from occluded facial frames, autoencoders (AEs) restore the optical flows of the occluded regions, and these recovered flows serve as inputs for predicting expression classes; optical flow reconstruction is thus followed by the classification stage. This study evaluates the performance of a classification model for facial expression recognition based on Very Deep Convolutional Networks (VGGNet). Furthermore, it reports confusion matrices for the proposed approach on the KMU-FED and CK+ databases, respectively. The results are evaluated using metrics including recall, F-measure, accuracy, and precision.
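The abstract does not detail the RL agent's policy for choosing informative patches. As a minimal illustration only, the sketch below replaces the learned policy with an epsilon-greedy rule over a simple entropy-based informativeness score; the function names, the scoring heuristic, and the exploration scheme are all assumptions, not the authors' method.

```python
import math
import random

def patch_entropy(patch):
    """Shannon entropy of pixel intensities in a patch, used here as a
    crude proxy for how informative the patch is (an assumption)."""
    counts = {}
    for px in patch:
        counts[px] = counts.get(px, 0) + 1
    n = len(patch)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def select_patches(patches, k, epsilon=0.1, rng=None):
    """Epsilon-greedy stand-in for the learned selection policy: usually
    pick the most informative remaining patch, occasionally explore."""
    rng = rng or random.Random(0)
    remaining = list(range(len(patches)))
    chosen = []
    for _ in range(min(k, len(remaining))):
        if rng.random() < epsilon:
            idx = rng.choice(remaining)      # explore a random patch
        else:
            idx = max(remaining, key=lambda i: patch_entropy(patches[i]))
        chosen.append(idx)
        remaining.remove(idx)
    return chosen
```

With `epsilon=0.0` the rule is purely greedy: a flat patch (entropy 0) is never chosen before a textured one, which mirrors the idea of labeling only the most informative regions rather than full images.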