Contrastive self-supervised representation learning without negative samples for multimodal human action recognition.

Huaigang Yang, Zhenyu Xu, Huaqiang Yuan, Jun Zhou, Ziliang Ren

doi:10.3389/fnins.2023.1225312

Abstract

Action recognition is an important component of human-computer interaction, and multimodal feature representation and learning methods can improve recognition performance because different modalities are interrelated and complementary. However, the lack of large-scale labeled samples severely constrains the performance of existing ConvNet-based methods. In this paper, a novel and effective multimodal feature representation and contrastive self-supervised learning framework is proposed to improve both the action recognition performance of models and their ability to generalize across application scenarios. The proposed framework shares weights between its two branches and does not require negative samples, so it can learn useful feature representations from unlabeled multimodal data, e.g., skeleton sequences and inertial measurement unit (IMU) signals. Extensive experiments are conducted on two benchmarks, UTD-MHAD and MMAct, and the results show that the proposed framework outperforms both unimodal and multimodal baselines in action retrieval, semi-supervised learning, and zero-shot learning scenarios.

Journal: Frontiers in Neuroscience
Publication Date: Jul 5, 2023
License type: CC BY 4.0

