Haphazard Cuboids Feature Extraction for Micro-Expression Recognition

Gang Wang, Shucheng Huang, Zizhao Dong

doi:10.1109/access.2022.3214808

Abstract

Facial micro-expressions can reveal a person’s actual mental state and emotions. They therefore have important applications in many fields, such as lie detection, clinical medicine, and defense security. However, conventional methods extract features from predefined facial regions and often miss the regions critical to a micro-expression, because micro-expressions are localized and asymmetric. We therefore propose the Haphazard Cuboids (HC) feature extraction method, which generates target regions by haphazard sampling and then extracts spatio-temporal micro-expression features from them. HC consists of two modules: spatial patches generation (SPG) and temporal segments generation (TSG). SPG generates localized facial regions, and TSG generates temporal intervals. Extensive experiments demonstrate the superiority of the proposed method. We then analyze the two modules with conventional and deep-learning methods and find that each significantly improves micro-expression recognition performance.
In particular, we embed the SPG module into deep-learning models and experimentally demonstrate the effectiveness and superiority of the proposed sampling method in comparison with state-of-the-art methods. Furthermore, we analyze the TSG module with the maximum overlapping interval (MOI) method and find that it is consistent with the maximum interval of the apex frame distribution in CASME II and SAMM. Therefore, analogous to the region of interest (ROI) on the human face, micro-expressions also have a similar ROI in the temporal dimension, whose position is closely related to the moment of greatest intensity, i.e., the apex frame.
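
The two sampling stages described above can be pictured with a minimal sketch. It is an illustration only and not the authors' implementation: the abstract does not state the sampling distributions, so uniform random sampling of patch size, patch position, segment length, and segment start is assumed here. The names `Cuboid`, `sample_cuboids`, `min_patch`, and `min_len` are hypothetical, and the feature extraction applied to each cuboid is omitted.

```python
import random
from typing import List, NamedTuple, Optional


class Cuboid(NamedTuple):
    """A spatio-temporal region: frames [t0, t1) and a patch of size h x w at (y0, x0)."""
    t0: int
    t1: int
    y0: int
    x0: int
    h: int
    w: int


def sample_cuboids(num_frames: int, height: int, width: int, n: int,
                   min_patch: int = 8, min_len: int = 2,
                   seed: Optional[int] = None) -> List[Cuboid]:
    """Draw n random cuboids inside a clip of num_frames frames of size height x width.

    Assumption: every quantity is sampled uniformly; the paper may use a
    different scheme.
    """
    if min_patch > min(height, width) or min_len > num_frames:
        raise ValueError("minimum cuboid size exceeds the clip dimensions")
    rng = random.Random(seed)
    cuboids = []
    for _ in range(n):
        # Spatial patch (the role the abstract assigns to SPG).
        h = rng.randint(min_patch, height)
        w = rng.randint(min_patch, width)
        y0 = rng.randint(0, height - h)
        x0 = rng.randint(0, width - w)
        # Temporal segment (the role the abstract assigns to TSG).
        length = rng.randint(min_len, num_frames)
        t0 = rng.randint(0, num_frames - length)
        cuboids.append(Cuboid(t0, t0 + length, y0, x0, h, w))
    return cuboids


# Example: 50 cuboids from a hypothetical 30-frame clip of 128 x 128 face crops.
cuboids = sample_cuboids(num_frames=30, height=128, width=128, n=50, seed=0)
```

Each returned cuboid lies fully inside the clip, so a feature descriptor could be computed on the corresponding sub-volume and the per-cuboid features combined for classification.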

Full Text