Abstract

In recent years, emotion recognition using physiological signals has become a popular research topic. Physiological signal can reflect the real emotional state for individual which is widely applied to emotion recognition. Multimodal signals provide more discriminative information compared with single modal which arose the interest of related researchers. However, current studies on multimodal emotion recognition normally adopt one-stage fusion method which results in the overlook of cross-modal interaction. To solve this problem, we proposed a multi-stage multimodal dynamical fusion network (MSMDFN). Through the MSMDFN, the joint representation based on cross-modal correlation is obtained. Initially, the latent and essential interactions among various features extracted independently from multiple modalities are explored based on specific manner. Subsequently, the multi-stage fusion network is designed to split the fusion procedure into multi-stages using the correlation observed before. This allows us to exploit much more fine-grained unimodal, bimodal and trimodal intercorrelations. For evaluation, the MSMDFN was verified on multimodal benchmark DEAP. The experiments indicate that our method outperforms the related one-stage multi-modal emotion recognition works.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.