Deep Learning for Sensor-based Human Activity Recognition

Abstract

The vast proliferation of sensor devices and the Internet of Things enables applications of sensor-based activity recognition. However, substantial challenges can influence the performance of recognition systems in practical scenarios. Recently, as deep learning has demonstrated its effectiveness in many areas, many deep methods have been investigated to address the challenges in activity recognition. In this study, we present a survey of the state-of-the-art deep learning methods for sensor-based human activity recognition. We first introduce the multi-modality of the sensory data and provide information on public datasets that can be used for evaluation in different challenge tasks. We then propose a new taxonomy that structures the deep methods by the challenges they address. Challenges and challenge-related deep methods are summarized and analyzed to form an overview of the current research progress. At the end of this work, we discuss open issues and provide insights into future directions.

Similar Papers
  • Research Article
  • Cited by 133
  • 10.1109/tai.2021.3076974
Graph Convolutional Neural Network for Human Action Recognition: A Comprehensive Survey
  • Apr 1, 2021
  • IEEE Transactions on Artificial Intelligence
  • Tasweer Ahmad + 5 more

Video-based human action recognition is one of the most important and challenging areas of research in the field of computer vision. It has found many pragmatic applications in video surveillance, human-computer interaction, entertainment, autonomous driving, etc. Owing to the recent development of deep learning methods for human action recognition, recognition performance has improved significantly on challenging datasets. Deep learning techniques are mainly used for recognizing actions in images and videos comprising Euclidean data. A recent development is the extension of these techniques to non-Euclidean graph data with many nodes and edges. The human body skeleton resembles a graph; therefore, the graph convolutional network (GCN) is applicable to this non-Euclidean structure. In the past few years, the GCN has emerged as an important tool for skeleton-based action recognition. Therefore, we conduct a survey of GCN methods for action recognition. Herein, we present a comprehensive overview of recent GCN techniques for action recognition, propose a taxonomy for their categorization, carry out a detailed study of the benchmark datasets, enlist relevant resources and open-source codes, and finally provide an outline of future research directions and trends. To the best of the authors' knowledge, this is the first survey of action recognition using GCN techniques.
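The graph-convolution operation applied to a skeleton can be sketched in a few lines. The following is a minimal NumPy sketch of one GCN layer in the widely used symmetric-normalization form H' = ReLU(D^-1/2 (A+I) D^-1/2 H W); the three-joint "skeleton", features, and weights are toy values chosen purely for illustration, not from any paper surveyed here.

```python
import numpy as np

def gcn_layer(adjacency, features, weights):
    """One graph-convolution layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    a_hat = adjacency + np.eye(adjacency.shape[0])       # add self-loops
    deg = a_hat.sum(axis=1)                              # node degrees
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))             # D^-1/2
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt             # symmetric normalization
    return np.maximum(0.0, a_norm @ features @ weights)  # linear transform + ReLU

# Toy 3-joint "skeleton": joints 0-1 and 1-2 connected.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
feats = np.eye(3)    # one-hot joint features
w = np.ones((3, 2))  # project 3 feature channels down to 2
out = gcn_layer(adj, feats, w)
print(out.shape)     # (3, 2): one 2-D feature vector per joint
```

Stacking several such layers lets information propagate between joints that are several bones apart, which is what skeleton-based GCN recognizers exploit.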

  • Book Chapter
  • Cited by 26
  • 10.1007/978-0-85729-997-0_15
Modeling and Recognition of Complex Human Activities
  • Jan 1, 2011
  • Nandita M Nayak + 3 more

Activity recognition is a field of computer vision which has shown great progress in the past decade. Starting from simple single person activities, research in activity recognition is moving toward more complex scenes involving multiple objects and natural environments. The main challenges in the task include being able to localize and recognize events in a video and deal with the large amount of variation in viewpoint, speed of movement and scale. This chapter gives the reader an overview of the work that has taken place in activity recognition, especially in the domain of complex activities involving multiple interacting objects. We begin with a description of the challenges in activity recognition and give a broad overview of the different approaches. We go into the details of some of the feature descriptors and classification strategies commonly recognized as being the state of the art in this field. We then move to more complex recognition systems, discussing the challenges in complex activity recognition and some of the work which has taken place in this respect. Finally, we provide some examples of recent work in complex activity recognition. The ability to recognize complex behaviors involving multiple interacting objects is a very challenging problem and future work needs to study its various aspects of features, recognition strategies, models, robustness issues, and context, to name a few.

  • Research Article
  • 10.62617/mcb936
Application of action recognition and tactical optimization methods for rope skipping competitions based on artificial intelligence
  • Dec 30, 2024
  • Molecular & Cellular Biomechanics
  • Huan Zhang

To address the problems that action recognition methods in rope skipping competitions rely on manual annotation and are prone to misjudging complex movements, this study implemented an AI-based rope skipping action recognition and tactical optimization method, using artificial intelligence technology to achieve efficient and accurate action recognition and tactical adjustment. Features are extracted from video frames by a Convolutional Neural Network (CNN), and the resulting feature sequence is fed into a Long Short-Term Memory (LSTM) network to achieve accurate recognition of rope skipping actions. To optimize competition strategy, a Deep Q-Network (DQN) is used to optimize tactical execution. Experimental results show that the proposed model can recognize common rope skipping movements such as the single jump, double-leg jump, and cross jump with an average accuracy of 98.4%; the tactical strategy optimized by reinforcement learning significantly improves athletes' performance, increasing jumping frequency by 4.59% and decreasing the error rate by 0.986%. This study not only provides an intelligent evaluation and optimization solution for rope skipping competitions but also serves as a reference for action recognition and tactical decision-making in other sports.

  • Conference Article
  • Cited by 14
  • 10.1145/2448556.2448639
Classifier ensemble optimization for human activity recognition in smart homes
  • Jan 17, 2013
  • Iram Fatima + 3 more

Recognizing human activities is an active research area due to its applicability in domains such as assistive living and healthcare. Currently, a major challenge in activity recognition is the reliability of each classifier's predictions, which differ according to smart home characteristics. No single classifier always performs better than all others in every possible situation. Therefore, in this paper, a method for activity recognition is proposed that optimizes the output of multiple classifiers with an evolutionary algorithm. We combine the measurement-level output of different classifiers in terms of weights for each activity class to make up the ensemble. The classifier ensemble learner generates activity rules by optimizing the prediction accuracy of weighted feature vectors, obtaining a significant improvement over raw classification. For the evaluation of the proposed method, experiments are performed on two real datasets from CASAS smart homes. The results show that our method systematically outperforms single classifiers and traditional multiclass models.
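The idea of evolving combination weights over classifier probability outputs can be sketched quite compactly. The paper optimizes per-class weights; the sketch below simplifies to one scalar weight per classifier and a (1+1) evolution strategy, and the two "classifier" probability matrices are synthetic values for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def ensemble_predict(prob_list, weights):
    """Weighted sum of each classifier's class-probability matrix, then argmax."""
    combined = sum(w * p for w, p in zip(weights, prob_list))
    return combined.argmax(axis=1)

def evolve_weights(prob_list, y_true, generations=200, sigma=0.1):
    """(1+1) evolution strategy: keep a mutated weight vector if accuracy improves."""
    best = np.ones(len(prob_list))  # start with equal weights
    best_acc = (ensemble_predict(prob_list, best) == y_true).mean()
    for _ in range(generations):
        cand = np.clip(best + rng.normal(0, sigma, best.shape), 0, None)
        acc = (ensemble_predict(prob_list, cand) == y_true).mean()
        if acc >= best_acc:
            best, best_acc = cand, acc
    return best, best_acc

# Two synthetic "classifiers" over 4 samples / 2 activity classes.
p1 = np.array([[0.9, 0.1], [0.4, 0.6], [0.8, 0.2], [0.3, 0.7]])
p2 = np.array([[0.6, 0.4], [0.1, 0.9], [0.7, 0.3], [0.9, 0.1]])  # wrong on last sample
y = np.array([0, 1, 0, 1])
weights, acc = evolve_weights([p1, p2], y)
```

Because the second classifier misclassifies the last sample, the search tends to down-weight it, so the evolved ensemble matches or beats the equal-weight baseline on the validation labels.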

  • Research Article
  • Cited by 4
  • 10.1007/s12559-025-10513-2
Deep Learning and Attention-Based Methods for Human Activity Recognition and Anticipation: A Comprehensive Review
  • Oct 25, 2025
  • Cognitive Computation
  • Md Rezowan Shuvo + 2 more

In recent years, there has been a significant increase in research focused on Human Activity Analysis (HAA). This field has progressed from basic activity recognition tasks to addressing more challenging ones, such as predicting future human actions based on partially observed videos and even predicting actions before they happen. The evolution of HAA has been driven by recent advancements in attention-based models like Transformers, along with a wide range of applications from security surveillance to advanced monitoring systems, behaviour analysis, and more. A comprehensive review of HAA literature from 2017 to 2025, with a novel taxonomy emphasising activity recognition, prediction, and anticipation, is presented. We critically review and examine recognition methods from trimmed and untrimmed videos, context-aware and trajectory-based prediction, and short-term and long-term anticipation. Through a comprehensive analysis, we review and evaluate key aspects of this domain, including attention-based contextual comprehension, temporal dynamics modelling, and multi-model fusion methods. Furthermore, we critically examine and assess the public datasets utilised in driving this research forward, pinpointing limitations and primary challenges within this domain. Finally, the paper provides a summary of recent developments in HAA and suggests future directions, with the hope that it will serve as a valuable reference for researchers in the field.

  • Conference Article
  • Cited by 32
  • 10.1109/get.2016.7916717
A survey on Human action recognition from videos
  • Nov 1, 2016
  • Chandni J Dhamsania + 1 more

Human action recognition is a way of retrieving videos that emerged from Content-Based Video Retrieval (CBVR). It is a growing area of research in the field of computer vision. Human action recognition has gained popularity because of its wide applicability in automatic retrieval of videos of a particular action using visual features. The most common stages of action recognition include object and human segmentation, feature extraction, activity detection, and classification. This paper describes the applications and challenges of human action recognition. Features and limitations of various methods for human action recognition are discussed. The paper surveys different types of actions: single-person action recognition, two-person or person-object interaction, and multiple-people action recognition.

  • Research Article
  • Cited by 7
  • 10.3233/ais-180496
A probabilistic data-driven method for human activity recognition
  • Sep 28, 2018
  • Journal of Ambient Intelligence and Smart Environments
  • Pouya Foudeh + 2 more

This paper proposes a probabilistic, time-efficient, data-driven method for low- and medium-level human activity recognition and indoor tracking. The obtained results can be fed to a probabilistic reasoner for high-level activity recognition. The proposed method is tested on Opportunity, a dataset consisting of daily morning activities in a highly sensor-rich environment. The main objective of this research is to suggest and apply methods suitable for batch processing of big data; in this case, performance in terms of CPU time and efficiency in storage usage are the top priorities. We applied fast signal processing methods to compute proper features from different collections of sensor signals. The relevant collections of features are selected and fed into a classifier, which outputs, for each instance, a probability of belonging to each available class. Additionally, the most probable locations of each subject in the room are calculated by processing noisy data from location tags on the subjects' bodies. Afterwards, the proposed probabilistic data smoothing method is applied to further increase accuracy. To evaluate the methods, the most probable recognitions are benchmarked against the results of the Opportunity Challenge competitions as well as results provided by the Opportunity group. We also implemented a couple of well-known methods on the same dataset and compared them with ours, and investigated the performance of different sensor assemblies. Our proposed method obtains very close results in terms of accuracy while requiring fewer features and less time.
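The paper's probabilistic smoothing step is not specified in the abstract; a minimal stand-in, shown below under that assumption, is to average each instance's class probabilities over a centered temporal window before taking the most probable class, which suppresses isolated misclassifications.

```python
import numpy as np

def smooth_probabilities(probs, window=3):
    """Average each instance's class probabilities over a centered window,
    then pick the most probable class per instance."""
    n = len(probs)
    smoothed = np.empty_like(probs)
    for i in range(n):
        lo, hi = max(0, i - window // 2), min(n, i + window // 2 + 1)
        smoothed[i] = probs[lo:hi].mean(axis=0)  # window shrinks at the edges
    return smoothed.argmax(axis=1)

# Five instances, two classes; instance 2 is a noisy outlier.
probs = np.array([[0.8, 0.2],
                  [0.7, 0.3],
                  [0.4, 0.6],   # spurious flip
                  [0.9, 0.1],
                  [0.8, 0.2]])
labels = smooth_probabilities(probs)
print(labels)  # [0 0 0 0 0]: the flip at instance 2 is smoothed away
```

Because activities change slowly relative to the sensor sampling rate, this kind of temporal smoothing is a common cheap accuracy boost in activity recognition pipelines.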

  • Conference Article
  • Cited by 3
  • 10.1109/icieam51226.2021.9446389
Method for Undefined Complex Human Activity Recognition
  • May 17, 2021
  • E S Abramova + 2 more

The article considers the problem of recognizing activities from undefined complex classes. The relevance of human activity recognition in the field of production and services is presented, and a classification of physical activity types is described. The possibilities of using data-driven and knowledge-driven methods for human activity recognition are considered, and an analysis of zero-shot learning is given. A functional model of the method for undefined complex human activity recognition is proposed. The datasets are described: the HH101 and HH105 datasets obtained from CASAS smart homes were used to conduct experimental studies. An experimental study of the developed method for human activity recognition is carried out.

  • Conference Article
  • Cited by 24
  • 10.1109/percom45495.2020.9127376
Human Activity Recognition with Deep Reinforcement Learning using the Camera of a Mobile Robot
  • Mar 1, 2020
  • Teerawat Kumrai + 4 more

This paper presents a new human activity recognition method that uses a camera mounted on a mobile robot. We assume that the robot's camera captures images of a person and recognizes his/her activities based on skeletal and visual features extracted from the images. A key issue encountered with this method for activity recognition is that it requires the robot to position itself so that it has an adequate field of view of the activities being conducted. For example, if the robot is directly behind a person while observing that person making tea, it will be difficult for the robot to distinguish that activity from other similar activities such as preparing a meal or washing dishes. Our method employs deep reinforcement learning to control the movements of the mobile robot that is observing the activities in order to maximize its recognition accuracy while minimizing its energy consumption related to its movement. We propose effective action- and state-space designs that can achieve early training convergence and highly accurate activity recognition by: (i) incorporating the confidence of the activity recognition output when evaluating the quality of the current state (position), (ii) incorporating the costs of subsequent actions when estimating values for those actions, and (iii) designing an effective action space that accelerates reinforcement learning by restricting the movement space of the robot to the circumference of a circle with a predefined radius centered on the person.
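Design (iii) above, restricting the robot's movement to the circumference of a circle around the person, can be sketched with tabular Q-learning. Everything numeric here is assumed for illustration: the eight discretized viewpoints, the per-viewpoint recognition confidences, and the movement cost are toy values, and the paper itself uses deep reinforcement learning rather than a Q-table.

```python
import numpy as np

rng = np.random.default_rng(1)
N_POS = 8              # discretized viewpoints on the circle around the person
ACTIONS = [-1, 0, 1]   # move clockwise, stay, move counter-clockwise
MOVE_COST = 0.05       # assumed energy penalty per movement step
# Hypothetical recognition confidence at each viewpoint (front of person = best).
confidence = np.array([0.9, 0.7, 0.4, 0.2, 0.1, 0.2, 0.4, 0.7])

q = np.zeros((N_POS, len(ACTIONS)))
alpha, gamma, eps = 0.2, 0.9, 0.2
pos = 4                                      # start directly behind the person
for step in range(5000):
    # epsilon-greedy action selection
    a = rng.integers(3) if rng.random() < eps else int(q[pos].argmax())
    nxt = (pos + ACTIONS[a]) % N_POS         # stay on the circle
    # reward = recognition confidence at new viewpoint minus movement cost
    reward = confidence[nxt] - MOVE_COST * abs(ACTIONS[a])
    q[pos, a] += alpha * (reward + gamma * q[nxt].max() - q[pos, a])
    pos = nxt

greedy = [int(q[s].argmax()) for s in range(N_POS)]  # learned policy per viewpoint
```

After training, the greedy policy at the highest-confidence viewpoint is "stay", illustrating how folding the movement cost into the reward makes the robot settle at a good view instead of circling.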

  • Research Article
  • Cited by 19
  • 10.3837/tiis.2013.11.018
A Genetic Algorithm-based Classifier Ensemble Optimization for Activity Recognition in Smart Homes
  • Nov 29, 2013
  • KSII Transactions on Internet and Information Systems
  • Iram Fatima + 3 more

Over the last few years, one of the most common purposes of smart homes has been to provide human-centric services in the domain of u-healthcare by analyzing inhabitants' daily living. Currently, a major challenge in activity recognition is the reliability of each classifier's predictions, which differ according to smart home characteristics. Smart homes vary in terms of performed activities, deployed sensors, environment settings, and inhabitants' characteristics. No single classifier always performs better than all others in every possible situation. This observation has motivated combining multiple classifiers to take advantage of their complementary performance for high accuracy. Therefore, in this paper, a method for activity recognition is proposed that optimizes the output of multiple classifiers with a Genetic Algorithm (GA). Our proposed method combines the measurement-level output of different classifiers for each activity class to make up the ensemble. For the evaluation of the proposed method, experiments are performed on three real datasets from CASAS smart homes. The results show that our method systematically outperforms single classifiers and traditional multiclass models. A significant improvement, from 0.82 to 0.90 in the F-measure of recognized activities, is achieved compared to existing methods.

  • Research Article
  • Cited by 1
  • 10.3390/a18040235
Hybrid Deep Learning Methods for Human Activity Recognition and Localization in Outdoor Environments
  • Apr 18, 2025
  • Algorithms
  • Yirga Yayeh Munaye + 4 more

Activity recognition and localization in outdoor environments involve identifying and tracking human movements using sensor data, computer vision, or deep learning techniques. This process is crucial for applications such as smart surveillance, autonomous systems, healthcare monitoring, and human–computer interaction. However, several challenges arise in outdoor settings, including varying lighting conditions, occlusions caused by obstacles, environmental noise, and the complexity of differentiating between similar activities. This study presents a hybrid deep learning approach that integrates human activity recognition and localization in outdoor environments using Wi-Fi signal data. The study focuses on applying the hybrid long short-term memory–bi-gated recurrent unit (LSTM-BIGRU) architecture, designed to enhance the accuracy of activity recognition and location estimation. Moreover, experiments were conducted using a real-world dataset collected with the PicoScene Wi-Fi sensing device, which captures both magnitude and phase information. The results demonstrated a significant improvement in accuracy for both activity recognition and localization tasks. To mitigate data scarcity, this study utilized the conditional tabular generative adversarial network (CTGAN) to generate synthetic channel state information (CSI) data. Additionally, carrier frequency offset (CFO) and cyclic shift delay (CSD) preprocessing techniques were implemented to mitigate phase fluctuations. The experiments were conducted in a line-of-sight (LoS) outdoor environment, where CSI data were collected using the PicoScene Wi-Fi sensor platform across four different activities at outdoor locations. Finally, a comparative analysis of the experimental results highlights the superior performance of the proposed hybrid LSTM-BIGRU model, achieving 99.81% and 98.93% accuracy for activity recognition and location prediction, respectively.

  • Research Article
  • Cited by 2
  • 10.3390/bioengineering11111124
Non-Contact Cross-Person Activity Recognition by Deep Metric Ensemble Learning
  • Nov 7, 2024
  • Bioengineering
  • Chen Ye + 5 more

In elderly monitoring and indoor intrusion detection, the recognition of human activity is a key task. Owing to several strengths of Wi-Fi-based devices, including non-contact sensing and privacy protection, these devices have been widely applied in smart homes. Using deep learning techniques, numerous Wi-Fi-based activity recognition methods achieve satisfactory recognition; however, these methods may fail to recognize the activities of an unknown person absent from the learning process. In this study, using channel state information (CSI) data, a novel cross-person activity recognition (CPAR) method is proposed via a deep learning approach with generalization capability. First, combining one of the state-of-the-art deep neural networks (DNNs) used in activity recognition, attention-based bi-directional long short-term memory (ABLSTM), snapshot ensembling is adopted to train several base classifiers, enhancing the generalization and practicability of recognition. Second, to discriminate the extracted features, metric learning is introduced using the center loss, yielding the snapshot-ensemble ABLSTM with center loss (SE-ABLSTM-C). In CPAR experiments on seven categories of activities, the proposed SE-ABLSTM-C method markedly improved recognition accuracy to an application level.
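The center loss used in SE-ABLSTM-C penalizes the distance between each feature vector and the learnable center of its class, pulling same-class features together. A minimal NumPy sketch of the loss computation follows; the class centers, features, and labels are toy values for illustration.

```python
import numpy as np

def center_loss(features, labels, centers):
    """Center loss: half the mean squared distance from each feature
    vector to the center of its own class."""
    diffs = features - centers[labels]          # gather each sample's class center
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))

# Two classes with toy 2-D centers; three feature vectors near their centers.
centers = np.array([[0.0, 0.0], [4.0, 4.0]])
feats = np.array([[0.1, -0.1], [3.9, 4.2], [0.0, 0.2]])
labels = np.array([0, 1, 0])
loss = center_loss(feats, labels, centers)
```

In training, this term is added to the usual classification loss, and the centers themselves are updated alongside the network weights so that intra-class variance shrinks while the classifier keeps classes apart.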

  • Conference Article
  • Cited by 29
  • 10.1109/cac.2017.8243438
Deep learning based human action recognition: A survey
  • Oct 1, 2017
  • Zhimeng Zhang + 6 more

Human action recognition has attracted much attention because of its great potential applications. With the rapid development of computer performance and the Internet, human action recognition methods based on deep learning have become mainstream and are developing at a rapid pace. This paper gives a novel, reasonable taxonomy and a review of deep learning human action recognition methods based on color videos, skeleton sequences, and depth maps. In addition, some datasets and effective tricks in deep learning action recognition methods are introduced, and the development trend is discussed.

  • Research Article
  • Cited by 8
  • 10.1186/1687-6180-2012-162
Human action recognition based on estimated weak poses
  • Jul 25, 2012
  • EURASIP Journal on Advances in Signal Processing
  • Wenjuan Gong + 2 more

We present a novel method for human action recognition (HAR) based on poses estimated from image sequences. We use 3D human pose data as additional information and propose a compact human pose representation, called a weak pose, in a low-dimensional space that still keeps the most discriminative information for a given pose. With poses predicted from image features, we map the problem from image feature space to pose space, where a Bag of Poses (BOP) model is learned for the final goal of HAR. The BOP model is a modified version of the classical bag-of-words pipeline that builds the vocabulary from the most representative weak poses for a given action. Compared with standard k-means clustering, our vocabulary selection criterion is shown to be more efficient and robust against the inherent challenges of action recognition. Moreover, since the ordering of poses is discriminative for action recognition, the BOP model incorporates temporal information: in essence, groups of consecutive poses are considered together when computing the vocabulary and assignment. We tested our method on two well-known datasets, HumanEva and IXMAS, to demonstrate that weak poses help improve action recognition accuracy. The proposed method is scene-independent and comparable with state-of-the-art methods.
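The assignment step of a bag-of-poses descriptor can be sketched as a nearest-word histogram: each frame's pose is mapped to the closest vocabulary word and the counts are normalized into a video-level descriptor. The 2-D "weak poses" and five-frame sequence below are toy values; real weak poses would live in the paper's learned low-dimensional pose space.

```python
import numpy as np

def bag_of_poses(pose_sequence, vocabulary):
    """Assign each frame's pose to its nearest vocabulary word and return
    a normalized histogram over the vocabulary."""
    # Squared Euclidean distance from every pose to every vocabulary word.
    dists = ((pose_sequence[:, None, :] - vocabulary[None, :, :]) ** 2).sum(-1)
    words = dists.argmin(axis=1)                 # nearest word per frame
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()                     # normalize to a distribution

# Toy vocabulary of three 2-D "weak poses" and a five-frame sequence.
vocab = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
seq = np.array([[0.1, 0.0], [0.9, 1.1], [1.1, 0.9], [2.1, -0.1], [0.0, 0.1]])
hist = bag_of_poses(seq, vocab)
print(hist)  # [0.4 0.4 0.2]
```

The resulting fixed-length histogram can be fed to any standard classifier; the paper's temporal extension would instead build the histogram over groups of consecutive poses.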

  • Book Chapter
  • 10.1007/978-3-031-28124-2_17
Research on Action Recognition Based on Zero-shot Learning
  • Jan 1, 2023
  • Hui Zhao + 2 more

At present, research on human action recognition has achieved remarkable results and is widely used in various industries. Among its branches, human action recognition based on deep learning has developed rapidly. With sufficient labeled data, supervised learning methods can achieve satisfactory recognition performance. However, the diversity of motion types and the complexity of video backgrounds make annotating human motion videos labor-intensive, which severely restricts the application of supervised human action recognition methods in practical scenarios. Since zero-shot learning can recognize unseen action categories without relying on a large amount of labeled data, action recognition methods based on zero-shot learning have received great attention from researchers in recent years. In this paper, we propose an attention-based zero-shot action recognition model, ADZSAR. We design a novel attention-based feature extraction method that incorporates a state-of-the-art semantic embedding model (Word2Vec). Experiments show that this method performs best among comparable zero-shot action recognition methods based on spatio-temporal features.
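The semantic-embedding matching at the heart of zero-shot recognition can be sketched as cosine-similarity nearest neighbor: a visual feature projected into the semantic space is matched against the word embeddings of the unseen class names. The 3-D embeddings and feature vector below are toy stand-ins for real Word2Vec vectors and a learned projection.

```python
import numpy as np

def zero_shot_classify(visual_feature, class_embeddings, class_names):
    """Pick the unseen class whose semantic embedding has the highest
    cosine similarity to the projected visual feature."""
    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    scores = {name: cosine(visual_feature, emb)
              for name, emb in zip(class_names, class_embeddings)}
    return max(scores, key=scores.get)

# Toy 3-D semantic embeddings for two unseen action classes.
names = ["jumping", "waving"]
embeds = np.array([[0.9, 0.1, 0.0],
                   [0.1, 0.8, 0.3]])
feature = np.array([0.85, 0.2, 0.05])  # hypothetical projected visual feature
pred = zero_shot_classify(feature, embeds, names)
print(pred)  # jumping
```

Because the class side needs only word embeddings, new action categories can be added at test time without any labeled video examples, which is exactly the appeal the abstract describes.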
