Abstract

Despite extensive work on extracting emotion descriptors from subtle facial cues, learning an effective spatiotemporal feature remains challenging for micro-expression recognition, because micro-expressions involve only small dynamic changes and occur in localized facial regions. These properties suggest that the representation is sparse in the spatiotemporal domain. In this letter, we present a high-performance spatiotemporal feature learning method based on a sparse transformer to address this issue. We extract strongly associated spatiotemporal features by discriminating the spatial attention map and attentively fusing the temporal features: feature maps derived from critical relations are fully utilized, while superfluous relations are masked. Our proposed method achieves remarkable results compared with state-of-the-art methods, demonstrating that sparse representation can be successfully integrated into the self-attention mechanism for micro-expression recognition.
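The abstract does not include code, but the core idea of keeping only critical relations in the attention map while masking superfluous ones can be illustrated with a minimal NumPy sketch of top-k sparse self-attention. The function name, the `top_k` parameter, and the tensor shapes here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def sparse_self_attention(q, k, v, top_k=2):
    """Scaled dot-product attention that keeps only the top_k strongest
    relations per query position and masks out the rest.

    Hypothetical sketch: q, k, v are (n, d) arrays for n positions.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)            # (n, n) pairwise relation map
    # Mask superfluous relations: keep the top_k scores per row,
    # set the remainder to -inf so softmax assigns them zero weight.
    thresh = np.sort(scores, axis=-1)[:, -top_k][:, None]
    masked = np.where(scores >= thresh, scores, -np.inf)
    # Softmax over the surviving (critical) relations only.
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                       # (n, d) attended features
```

With `top_k=1` each output row collapses to a single value row, and as `top_k` approaches `n` the function recovers dense self-attention; the paper's contribution is applying this kind of sparsification jointly over spatial and temporal attention for micro-expression features.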
