Egocentric Action Recognition Research Articles

Egocentric videos, which record the daily activities of individuals from a first-person point of view, have attracted increasing attention in recent years because of their growing use in popular applications such as life logging, health monitoring, and virtual reality. Egocentric action recognition, a fundamental problem in egocentric vision, aims to recognize the actions of camera wearers from egocentric videos. Relation modeling is important for this task because the interactions between the camera wearer and the recorded persons or objects form complex relations. However, only a few existing methods model the relations between the camera wearer and the interacting persons, and they require prior knowledge or auxiliary data to localize those persons. In this work, we model the relations in a weakly supervised manner, i.e., without annotations or prior knowledge about the interacting persons or objects. We form a weakly supervised framework that unifies automatic interactor localization and explicit relation modeling. First, we learn to automatically localize the interactors, i.e., the body parts of the camera wearer and the persons or objects that the camera wearer interacts with, by learning a series of keypoints directly from video data; these keypoints localize the action-relevant regions using only action labels and some constraints on the keypoints. Second, and more importantly, to explicitly model the relations between the interactors, we develop an ego-relational LSTM (long short-term memory) network with several candidate connections that model the complex relations in egocentric videos, such as temporal, interactive, and contextual relations. In particular, to reduce the human effort and manual intervention needed to construct an optimal ego-relational LSTM structure, we search for the optimal connections with a differentiable network architecture search mechanism, which automatically constructs the ego-relational LSTM network to explicitly model different relations for egocentric action recognition. We conduct extensive experiments on egocentric video datasets to illustrate the effectiveness of our method.
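The idea of searching over candidate connections with a differentiable mechanism can be illustrated with a generic continuous relaxation, in which each candidate is weighted by a softmax over learnable architecture scores and the strongest candidate is kept after the search. The sketch below is only an illustration of that general technique, not the authors' network: the function names, the three connection names, and the toy feature vectors are all assumptions, and no gradient-based training is shown.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of architecture scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixed_connection(candidate_features, alpha):
    """Blend candidate connection outputs with softmax(alpha) weights.

    candidate_features: one feature vector per candidate connection
    alpha: one (hypothetical) learnable architecture score per candidate
    """
    weights = softmax(alpha)
    dim = len(candidate_features[0])
    mixed = [
        sum(w * feat[i] for w, feat in zip(weights, candidate_features))
        for i in range(dim)
    ]
    return mixed, weights

def select_connection(alpha, names):
    """After the search, keep the candidate with the largest score."""
    best = max(range(len(alpha)), key=lambda k: alpha[k])
    return names[best]

# Toy usage with made-up values (assumptions, not data from the paper).
names = ["temporal", "interactive", "contextual"]
feats = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
mixed, weights = mixed_connection(feats, [0.0, 0.0, 0.0])
chosen = select_connection([0.1, 2.0, -1.0], names)
```

With equal scores every candidate contributes equally, so the blend is the plain average of the candidate features; as the scores diverge during a search, the blend approaches the single connection that `select_connection` would keep.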

• Current approaches are driven by data-hungry deep learning algorithms that require large amounts of annotated training data.
• Deep learning models are inductive learners with a fixed vocabulary and do not generalize beyond their training domain.
• We address the problem of open-world action recognition (i.e., unknown vocabulary) with Pattern Theory and ConceptNet.
• Extensive experiments show our competitive performance for open-world egocentric action recognition and object detection.

Advances in deep learning have enabled the development of models that show a remarkable ability to recognize and even localize actions in videos. However, they tend to make errors when faced with scenes or examples beyond their initial training environment. Hence, they fail to adapt to new domains without significant retraining on large amounts of annotated data. In this paper, we propose to overcome these limitations by moving to an open-world setting and decoupling recognition from reasoning. Building upon the compositional representation offered by Grenander’s Pattern Theory formalism, we show that attention and commonsense knowledge can be used to enable the self-supervised discovery of novel actions in egocentric videos in an open-world setting, where data from the observed environment (the target domain) is open, i.e., the vocabulary is partially known and training examples (both labeled and unlabeled) are not available. We show that our approach can infer and learn novel classes for open-vocabulary classification in egocentric videos and for novel object detection with zero supervision. Extensive experiments show its competitive performance on two publicly available egocentric action recognition datasets (GTEA Gaze and GTEA Gaze+) under open-world conditions.

Articles published on Egocentric Action Recognition

Cross-view action recognition understanding from exocentric to egocentric perspective

Egocentric activity recognition using two-stage decision fusion

Distilling interaction knowledge for semi-supervised egocentric action recognition

Continual Egocentric Activity Recognition With Foreseeable-Generalized Visual–IMU Representations

A Multi-Modal Egocentric Activity Recognition Approach towards Video Domain Generalization.

Bringing Online Egocentric Action Recognition Into the Wild

What we see is what we do: a practical Peripheral Vision-Based HMM framework for gaze-enhanced recognition of actions in a medical procedural task

Egocentric Action Recognition by Automatic Relation Modeling.

Enhanced Attention Tracking With Multi-Branch Network for Egocentric Activity Recognition

Knowledge guided learning: Open world egocentric action recognition with zero supervision

Egocentric Vision-based Action Recognition: A survey

Multi-Dataset, Multitask Learning of Egocentric Vision Tasks.

Learning to Recognize Actions on Objects in Egocentric Video With Attention Dictionaries.

STAC: Spatial-Temporal Attention on Compensation Information for Activity Recognition in FPV.

Trear: Transformer-Based RGB-D Egocentric Action Recognition

Symbiotic Attention for Egocentric Action Recognition With Object-Centric Alignment.

A hierarchical parallel fusion framework for egocentric ADL recognition based on discernment frame partitioning and belief coarsening

Multi-modal egocentric activity recognition using multi-kernel learning

Symbiotic Attention with Privileged Information for Egocentric Action Recognition

Progressive Motion Representation Distillation With Two-Branch Networks for Egocentric Activity Recognition
