True health encompasses both physical and psychological well-being. However, existing IoT-eHealth systems typically monitor only the user's physical data through various sensors and neglect the user's mental state. To make IoT-eHealth systems more intelligent and give them a psychological monitoring capability, this paper proposes a novel collaborative model based on a Graph Convolutional Network (GCN) and a Transformer for Micro-Expression (ME) recognition. First, the facial information in each frame is transformed into a Spatial Topological Relationship Graph (STRG) using facial landmark detection and the psychological relationships among local patches. Then, to automatically aggregate from the structured graph data the key facial-patch information that contributes to ME recognition, a Hierarchical Adaptive Graph Pooling (HAGP) module is designed; built on GCN, it exploits the graph structure and the global dependencies among vertices to obtain discriminative frame-level features. Finally, to model long-term dependencies among frames and capture the key frame-level features that benefit ME recognition, a Temporal Sensitive Self-Attention (TSSA) mechanism is designed, and a novel Temporal Sensitive Transformer (TST) encoder is built on TSSA to explore how facial patterns evolve over time and to obtain discriminative video-level features. In comparative experiments covering standard dataset verification and practical dataset testing, the proposed collaborative model outperforms the other methods and achieves the highest recognition accuracy, which nearly meets the application requirements of IoT-eHealth systems.
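The three-stage pipeline described in the abstract (per-frame graph, graph convolution with pooling, attention across frames) can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: plain mean pooling stands in for the learned HAGP module, unparameterized scaled dot-product attention stands in for TSSA, and all function names, the toy adjacency matrix, and the feature values are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 used by a standard GCN."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 because of the added self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(adj_norm, feats, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    out = matmul(matmul(adj_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

def mean_pool(rows):
    """Average over rows (assumption: a simple stand-in for learned HAGP pooling)."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(frames):
    """Scaled dot-product self-attention across frames with identity
    projections (assumption: a simple stand-in for TSSA)."""
    d = len(frames[0])
    outputs, weights = [], []
    for q in frames:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in frames]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[t] * frames[t][c] for t in range(len(frames)))
                        for c in range(d)])
    return outputs, weights

def video_feature(adj, frames_node_feats, weight):
    """Per-frame graph features -> frame-level vectors -> video-level vector."""
    adj_norm = normalize_adjacency(adj)
    frame_feats = [mean_pool(gcn_layer(adj_norm, x, weight)) for x in frames_node_feats]
    attended, attn = self_attention(frame_feats)
    return mean_pool(attended), attn

# Toy input: 3 facial patches connected in a chain, 2 frames, 2 features per patch.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
frames = [
    [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1]],
    [[0.4, 0.2], [0.1, 0.5], [0.3, 0.3]],
]
weight = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]  # hypothetical 2 -> 3 projection
feature, attn = video_feature(adj, frames, weight)
```

Here `feature` is a single video-level vector and `attn` holds one row of attention weights per frame; in a trained model, the pooling and attention steps would carry learned parameters rather than the fixed operations used here.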