Self-Supervised Joint Encoding of Motion and Appearance for First Person Action Recognition

Mirco Planamente

doi:10.48448/aewj-7w74

Abstract

Wearable cameras are becoming increasingly popular in several applications, raising the research community's interest in approaches for recognizing actions from the first-person point of view. An open challenge in egocentric action recognition is that videos lack detailed information about the main actor's pose and, when focusing on manipulation tasks, tend to record only parts of the movement. As a result, the amount of information about the action itself is limited, which makes it crucial to understand the manipulated objects and their context. Many previous works addressed this issue with two-stream architectures, where one stream models the appearance of the objects involved in the action and the other extracts motion features from optical flow. In this paper, we argue that learning features jointly from these two information channels better captures the spatio-temporal correlations between them. To this end, we propose a single-stream architecture that does so through the addition of a self-supervised block, which uses a pretext motion prediction task to intertwine motion and appearance knowledge. Experiments on several publicly available databases support the effectiveness of our approach.
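
The idea of training one stream on an action loss plus a self-supervised motion-prediction loss can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the form of the pretext task, so the sketch assumes a binary "moving or static" target per spatial cell, linear heads on top of shared features, and a hypothetical weighting factor `weight`. All function and parameter names below are illustrative.

```python
import numpy as np

def cross_entropy(logits, labels):
    # Softmax cross-entropy averaged over the batch (action classification).
    z = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return float(-np.log(p[np.arange(len(labels)), labels] + 1e-12).mean())

def binary_cross_entropy(logits, targets):
    # Sigmoid cross-entropy averaged over all cells (assumed pretext task).
    p = 1.0 / (1.0 + np.exp(-logits))
    eps = 1e-12
    return float(-(targets * np.log(p + eps)
                   + (1.0 - targets) * np.log(1.0 - p + eps)).mean())

def joint_loss(features, w_action, w_motion, action_labels, motion_targets, weight=1.0):
    """Combine the supervised and self-supervised objectives.

    features:       (batch, dim) shared features from a single appearance stream
    w_action:       (dim, n_actions) hypothetical linear action head
    w_motion:       (dim, n_cells) hypothetical linear motion-prediction head
    action_labels:  (batch,) integer action classes
    motion_targets: (batch, n_cells) 0/1 motion map, assumed to be derived
                    from the video itself, so no extra annotation is needed
    """
    action_loss = cross_entropy(features @ w_action, action_labels)
    motion_loss = binary_cross_entropy(features @ w_motion, motion_targets)
    # Both losses depend on the same features, so minimizing the sum pushes
    # motion information into the appearance representation.
    return action_loss + weight * motion_loss, action_loss, motion_loss

# Example call on random data (shapes only; values carry no meaning).
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 16))
total, a_loss, m_loss = joint_loss(
    feats,
    rng.normal(size=(16, 5)),
    rng.normal(size=(16, 9)),
    rng.integers(0, 5, size=4),
    rng.integers(0, 2, size=(4, 9)).astype(float),
    weight=0.5,
)
```

In a setup like this, the motion head would be needed only during training; at test time the shared features and the action head suffice, which is one way a single stream could avoid a separate optical-flow branch.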

Similar Papers

Self-Supervised Joint Encoding of Motion and Appearance for First Person Action Recognition
Mirco Planamente ... Barbara Caputo
10 Jan 2021

Egocentric Action Recognition by Automatic Relation Modeling.
Haoxin Li ... Haifeng Hu
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 45
01 Jan 2023

Mitigating Bystander Privacy Concerns in Egocentric Activity Recognition with Deep Learning and Intentional Image Degradation
Mariella Dimiccoli ... Edison Thomaz
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies | VOL. 1
08 Jan 2018

Wearable Cameras in Health
Aiden R Doherty ... Alan F Smeaton
American Journal of Preventive Medicine | VOL. 44
01 Mar 2013
