Activity Recognition Research Articles

Skeleton-based action recognition helps in understanding human behavior in videos and has therefore received much attention in recent years as an important research area within action recognition. Current research focuses on designing more advanced algorithms to better extract spatio-temporal information from skeleton data. However, because existing skeleton datasets are small and effective data augmentation methods are lacking, models easily overfit during training. To address this challenge, we propose a mix-based data augmentation method, Joint Mixing Data Augmentation (JMDA), which can generally improve the effectiveness and robustness of various skeleton-based action recognition algorithms. For spatial information, we introduce SpatialMix (SM), a method that projects the original discrete 3D skeleton information into a 2D space. SM then mixes the projected spatial information of two random samples during training to achieve spatial mixing data augmentation. For temporal information, we propose TemporalMix (TM). Leveraging the temporal continuity of skeleton data, TM performs a temporal resize operation on the original skeleton data and then merges two random samples during training to achieve temporal mixing data augmentation. Additionally, we analyze the Feature Mismatch (FM) problem caused by introducing mix-based data augmentation into skeleton data, and we propose a new data preprocessing method, Feature Alignment (FA), to address this problem and improve model performance. Moreover, we propose a novel training pipeline, Joint Training Strategy (JTS), which combines multiple mix-based data augmentation methods to further improve model performance. JMDA is plug-and-play and widely applicable to skeleton-based action recognition models. It does not increase the number of model parameters and adds almost no training cost. We conduct extensive experiments on the NTU RGB+D 60 and NTU RGB+D 120 datasets to demonstrate the effectiveness and robustness of JMDA on several mainstream skeleton-based action recognition algorithms.
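The temporal mixing idea described in this abstract (resize two sequences in time, then merge them) can be illustrated with a minimal sketch. This is not the authors' implementation: the `(T, V, C)` array layout (frames, joints, coordinates), the linear interpolation used for resizing, the concatenation order, the frame-proportional label mixing, and the names `temporal_resize`, `temporal_mix`, and `lam` are all assumptions made for illustration.

```python
import numpy as np


def temporal_resize(seq: np.ndarray, new_len: int) -> np.ndarray:
    """Linearly interpolate a (T, V, C) skeleton sequence to new_len frames.

    Assumption: linear interpolation stands in for the "temporal resize"
    mentioned in the abstract, which does not specify the method.
    """
    old_len = seq.shape[0]
    # Fractional position in the original sequence for each output frame.
    pos = np.linspace(0.0, old_len - 1, num=new_len)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, old_len - 1)
    w = (pos - lo)[:, None, None]  # interpolation weight, broadcast over V and C
    return (1.0 - w) * seq[lo] + w * seq[hi]


def temporal_mix(seq_a, seq_b, label_a, label_b, lam):
    """Hypothetical temporal mix of two equal-length sequences.

    Roughly a fraction lam of the output frames come from seq_a (resized),
    the rest from seq_b (resized). Labels are mixed by the same frame ratio.
    """
    total = seq_a.shape[0]
    # Keep at least one frame from each sample.
    len_a = min(max(int(round(lam * total)), 1), total - 1)
    len_b = total - len_a
    mixed = np.concatenate(
        [temporal_resize(seq_a, len_a), temporal_resize(seq_b, len_b)], axis=0
    )
    ratio = len_a / total
    mixed_label = ratio * label_a + (1.0 - ratio) * label_b
    return mixed, mixed_label


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(64, 25, 3))  # 64 frames, 25 joints, 3D coordinates
    b = rng.normal(size=(64, 25, 3))
    ya, yb = np.eye(60)[3], np.eye(60)[17]  # one-hot labels, 60 classes
    x, y = temporal_mix(a, b, ya, yb, lam=0.5)
    print(x.shape, y[3], y[17])
```

Because the mixed sequence keeps the original number of frames, a sketch like this leaves the model input shape unchanged, which is consistent with the abstract's claim that the augmentation adds no parameters.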

Situation recognition is a crucial problem in scene understanding, activity understanding, and action reasoning, as it provides a structured representation of the main activity depicted in an image. Semantic role labeling is central to situation recognition and is challenging because a single action can have multiple meanings and purposes depending on its context. Understanding images beyond the highlighted actions requires inferences about the context of the scene, the objects, and their roles in the captured event. Recently, situation recognition (SR) has been introduced, which jointly derives a collection of the action (activity), meaning-role, and noun (entity) pairs in the form of moving images. To label these frames as action frames, we must assign nouns (entities) to the roles based on the content of the observed image. One of the main challenges is managing the complex dependencies between the assigned roles (nouns) and the predicted action, as correct role assignment often depends on the accuracy of the action prediction. We introduce RoadSitu, a road situation recognition approach that generates a structured summary of what is happening in a road scenario using an action and the semantic roles played by agents in a video frame. The action can describe a diverse set of situations, and the same agent can play various roles depending on the situation depicted in the video frame. Therefore, a situation recognition model needs to understand the context of each video frame and the visual-linguistic meaning of the semantic roles in that particular frame. One of the main challenges in this work is the complex task of annotating video frames with semantic roles and handling the structured dependencies between the assigned roles (nouns) and the predicted action (activity). Additionally, the sparsity of meaningful semantic information within road scenarios poses further difficulties. To overcome these challenges, we introduce a novel approach in which action recognition and noun estimation work together interactively to form structured summaries of each situation. In experiments using a road video dataset obtained from a South Korean company, RoadSitu achieved significant improvements across various performance metrics, with a Top-1 verb accuracy of 43.46%, a Top-5 verb accuracy of 72.48%, and a value accuracy of 34.21%, outperforming baseline models such as GSRTR and JSL by 2.4% and 3.86% in Top-1 verb accuracy, respectively. These results demonstrate the effectiveness of our model in handling complex road scenarios.
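The Top-1 and Top-5 verb accuracies reported in this abstract are commonly computed as the share of samples whose true class appears among the k highest-scoring predictions. The sketch below shows that general definition only; it is not taken from the RoadSitu evaluation code, and the function name `top_k_accuracy` and the toy scores are hypothetical.

```python
import numpy as np


def top_k_accuracy(scores: np.ndarray, targets: np.ndarray, k: int) -> float:
    """Fraction of rows whose target class is among the k highest scores.

    scores:  (N, num_classes) array of per-class scores
    targets: (N,) array of integer class indices
    """
    # Indices of the k largest scores per row (order within the k is irrelevant).
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    hits = (top_k == targets[:, None]).any(axis=1)
    return float(hits.mean())


if __name__ == "__main__":
    # Toy example: 3 samples, 3 verb classes (values are illustrative only).
    scores = np.array([
        [0.1, 0.7, 0.2],
        [0.5, 0.3, 0.2],
        [0.2, 0.3, 0.5],
    ])
    targets = np.array([1, 1, 0])
    print(top_k_accuracy(scores, targets, k=1))
    print(top_k_accuracy(scores, targets, k=2))
```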

Articles published on Activity Recognition

Commodity Wi-Fi-Based Wireless Sensing Advancements over the Past Five Years

A kinematic dataset of locomotion with gait and sit-to-stand movements of young adults

Exploring Cutout and Mixup for Robust Human Activity Recognition on Sensor and Skeleton Data

Non-Contact Cross-Person Activity Recognition by Deep Metric Ensemble Learning

Joint Mixing Data Augmentation for Skeleton-based Action Recognition

ACORN+: Adaptive Compression-Reconstruction for Device-Cloud Collaboration Video Services

Recognizing human activities using light-weight and effective machine learning methodologies

Contextual motion-aware for group activity recognition

Real-life boxing activity recognition with smartphones using attention assisted deep learning models

Extended delays in recognition of stroke symptoms and stroke code activation for in-hospital strokes: The DELAY study.

Lower Limb Motion Recognition Based on sEMG and CNN-TL Fusion Model.

METAHEURISTIC-ASSISTED HYBRID RECOGNITION MODEL FOR BRAIN ACTIVITY DETECTION

AUTOMATED INDOOR ACTIVITY MONITORING FOR ELDERLY AND VISUALLY IMPAIRED PEOPLE USING QUANTUM SALP SWARM ALGORITHM WITH MACHINE LEARNING

A benchmark for domain adaptation and generalization in smartphone-based human activity recognition

RoadSitu: Leveraging Road Video Frame Extraction and Three-Stage Transformers for Situation Recognition

OTM-HC: Enhanced Skeleton-Based Action Representation via One-to-Many Hierarchical Contrastive Learning

A smart home energy management system based on human activity recognition and deep reinforcement learning

Enhancing inertial sensor-based sports activity recognition through reduction of the signals and deep learning

Using Microwave Simulations for the Generation of Radar Data for Gesture and Activity Recognition: Potential and Challenges for Real-World Applications

Multistream Adaptive Attention-Enhanced Graph Convolutional Networks for Youth Fencing Footwork Training.
