Adaptive temporal convolutional network with multi-head EMA-gated attention for continuous radar-based human activity recognition
- Research Article
- 10.3390/app15062905
- Mar 7, 2025
- Applied Sciences
Radar-based continuous human activity recognition (HAR) in realistic scenarios faces challenges in segmenting and classifying overlapping or concurrent activities. This paper introduces a feedback-driven adaptive segmentation framework for multi-label classification in continuous HAR, leveraging Bayesian optimization (BO) and reinforcement learning (RL) to dynamically adjust segmentation parameters such as segment length and overlap in the data stream, optimizing them based on performance metrics such as accuracy and F1-score. Using a public dataset of continuous human activities, the method trains ResNet18 models on spectrogram, range-Doppler, and range-time representations from a 20% computational subset. Then, it scales optimized parameters to the full dataset. Comparative analysis against fixed-segmentation baselines was made. The results demonstrate significant improvements in classification performance, confirming the potential of adaptive segmentation techniques in enhancing the accuracy and efficiency of continuous multi-label HAR systems.
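The two segmentation parameters named in this abstract, segment length and overlap, can be illustrated with a minimal sliding-window sketch. It is not the authors' implementation: the function name `segment_indices`, the fractional definition of overlap, and the choice to drop a trailing partial window are all assumptions made for illustration. An optimizer such as BO or RL would then search over `seg_len` and `overlap`.

```python
def segment_indices(n_samples, seg_len, overlap):
    """Return (start, end) index pairs of fixed-length windows over a stream.

    seg_len is the window length in samples; overlap is the fraction of a
    window shared with the next one, in [0, 1). Both names are hypothetical.
    """
    if seg_len <= 0 or not 0 <= overlap < 1:
        raise ValueError("seg_len must be positive and overlap in [0, 1)")
    # The hop between consecutive windows shrinks as overlap grows.
    step = max(1, int(round(seg_len * (1 - overlap))))
    # A trailing partial window is dropped (an assumption of this sketch).
    return [(s, s + seg_len) for s in range(0, n_samples - seg_len + 1, step)]


if __name__ == "__main__":
    print(segment_indices(10, 4, 0.5))
```

With 10 samples, a window of 4 and 50% overlap, the hop is 2 samples and four windows are produced.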
- Research Article
75
- 10.1109/jmw.2023.3264494
- Jul 1, 2023
- IEEE Journal of Microwaves
Radar-based human motion and activity recognition is currently a topic of great research interest, as the aging population increases and older individuals prefer an independent lifestyle. This technology has a wide range of applications, such as fall detection in assisted living, gesture recognition for human-machine interfaces, and many more. Numerous studies exist on various approaches for radar-based activity capture and classification. However, most of these employ rather artificial data, often obtained in laboratory environments, and typically collected under particular conditions. Specifically, most research so far has aimed at distinguishing a predefined set of single activities with a defined start, stop and duration. This paper aims at drawing the attention to a so far less researched issue, one that will be of vital importance for future real-world application of radar-based human activity recognition: continuous activity recognition, i.e. recognizing specific activities in a stream of several sequential activities with unknown duration and arbitrary transitions between different classes of activities. A review on the current state of the art in this relatively new topic is given, followed by a discussion on future research directions.
- Research Article
3
- 10.14569/ijacsa.2020.0111074
- Jan 1, 2020
- International Journal of Advanced Computer Science and Applications
Human activity recognition has been an important task for the research community. With the introduction of deep learning architectures, the performance of activity recognition algorithms has improved significantly. However, most of the research in this area has focused on activity recognition for health/assisted living with other applications being given less attention. This paper considers continuous activity recognition in logistics (order picking and packing operations) using a convolutional neural network with temporal convolutions on inertial measurement sensor data from the recently released LARa dataset. Four variants of the popular CNN-IMU are experimented upon and a discussion of the results is provided. The results indicate that temporal convolutions are able to achieve satisfactory performance for some activities (hand center and cart) whereas they perform poorly for the activities of stand and hand up.
- Research Article
4
- 10.1088/1361-6501/ad9622
- Dec 4, 2024
- Measurement Science and Technology
The utilization of millimeter-wave radar sensors for continuous human activity recognition technology has garnered significant interest. Prior research predominantly concentrated on recursive neural networks, which often incorporate numerous extraneous information features, hindering the ability to make precise and effective predictions for ongoing activities. In response to this challenge, this paper introduces a dual-dilated one-dimensional temporal convolutional network model with an attention mechanism (R-ATCN). By stacking temporal convolutions to enhance the receptive field without compromising temporal resolution, the R-ATCN effectively captures features. Additionally, the attention mechanism is employed to capture crucial frame information related to activity transitions and overall features. The study gathered 60 data sets from 5 participants utilizing frequency modulated continuous wave radar. It encompassed 8 various activities lasting a total of 52.5 min, with randomized durations and transition times for each activity. To evaluate the performance of the model, this paper also introduces evaluation metrics such as short-time tolerance (STT) score. Experimental results show that the R-ATCN model outperforms other contrastive models in terms of segmental F1-score and STT scores. The effectiveness of the proposed model lies in its ability to accurately identify ongoing human activities within indoor environments.
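The claim that stacking temporal convolutions enlarges the receptive field without reducing temporal resolution can be checked with a short calculation. The sketch below assumes stride-1 one-dimensional convolutions with no pooling, where each layer with kernel size k and dilation d adds (k - 1) * d frames of context. The kernel size and dilation schedule shown are illustrative values, not the R-ATCN configuration, which the abstract does not give.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in frames, of stacked stride-1 dilated 1D convolutions.

    Each layer widens the field by (kernel_size - 1) * dilation; with
    stride 1 the output keeps one prediction per input frame.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    # Hypothetical schedules: four layers, kernel size 3.
    print(receptive_field(3, [1, 2, 4, 8]))  # exponentially growing dilation
    print(receptive_field(3, [1, 1, 1, 1]))  # no dilation, same depth
```

At equal depth, the dilated stack sees 31 frames and the undilated stack sees 9, which is the kind of gain in temporal context the abstract refers to.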
- Conference Article
9
- 10.1109/icra.2014.6907789
- May 1, 2014
Most previous research has focused on classifying single human activities contained in segmented videos. However, in real-world scenarios, human activities are inherently continuous and gradual transitions always exist between temporally adjacent activities. In this paper, we propose a Fuzzy Segmentation and Recognition (FuzzySR) algorithm to explicitly model this gradual transition. Our goal is to simultaneously segment a given video into events and recognize the activity contained in each event. Specifically, our algorithm uniformly partitions the video into a sequence of non-overlapping blocks, each of which lasts a short period of time. Then, a multi-variable time series is creatively formed through concatenating the block-level human activity summaries that are computed using topic models over each block's local spatio-temporal features. By representing an event as a fuzzy set that has fuzzy boundaries to model gradual transitions, our algorithm is able to segment the video into a sequence of fuzzy events. By incorporating all block summaries contained in an event, the proposed algorithm determines the most appropriate activity category for each event. We evaluate our algorithm's performance using two real-world benchmark datasets that are widely used in the machine vision community. We also demonstrate our algorithm's effectiveness in important robotics applications, such as intelligent service robotics. For all used datasets, our algorithm achieves promising continuous human activity segmentation and recognition results.
- Research Article
83
- 10.1109/tgrs.2022.3189746
- Jan 1, 2022
- IEEE Transactions on Geoscience and Remote Sensing
Unconstrained human activities recognition with a radar network is considered. A hybrid classifier combining both CNNs and RNNs for spatial-temporal pattern extraction is proposed. The two-dimensional CNNs (2D-CNNs) are first applied to the radar data to perform spatial feature extraction on the input spectrograms. Subsequently, gated recurrent units with bidirectional implementations are used to capture the long- and short-term temporal dependencies in the feature maps generated by the 2D-CNNs. Three NN-based data fusion methods were explored and compared to utilize the rich information provided by the different radar nodes. The performance of the proposed classifier was validated rigorously using the K-fold CV and L1PO method. Unlike competitive research, the dataset with continuous human activities with seamless inter-activity transitions that can occur at any time and unconstrained moving trajectories of the participants has been collected and used for evaluation purposes. Classification accuracy of about 90.8% is achieved for nine-class HAR by the proposed classifier with the halfway fusion method.
- Conference Article
4
- 10.1109/radarconf2248738.2022.9764181
- Mar 21, 2022
Continuous Human Activity Recognition (HAR) in arbitrary directions is investigated using 5 spatially distributed pulsed Ultra-Wideband (UWB) radars. Such activities performed in arbitrary and unconstrained trajectories render a more natural occurrence of Activities of Daily Living (ADL) to be recognized. An innovative signal level fusion method was applied on the Range-Time (RT) maps, and deep learning classification via Recurrent Neural Networks (RNN) with and without bidirectionality was used on the computed micro-Doppler (μD) spectrogram. To assess classification performances, novel evaluation metrics accounting for the continuous nature of the sequence of activities and for imbalances in the dataset are proposed and compared with existing metrics. It is shown that conventional accuracy evaluation is too coarse, and that the proposed metrics need to be considered for a more comprehensive evaluation.
- Book Chapter
1
- 10.1007/978-3-030-29897-5_10
- Jan 1, 2020
With the advent and proliferation of wearable sensors, Human Activity Recognition (HAR) has received considerable research attention in recent years. Most existing HAR systems operate in a batch-processing (offline) mode, and they rely upon complex features from accelerometer readings for activity recognition. On the other hand, many applications such as continuous patient monitoring and elder fall detection demand real-time human activity recognition, and existing offline systems are inadequate for these applications. In this paper, we investigate challenges of real-time human activity recognition and present an effective framework based on a waveform pattern matching approach. We introduce the concept of A-Shapelets (activity shapelets), which is a representative pattern for each activity. Our framework incorporates several novel aspects; first, we present a scheme for computing the most distinctive A-Shapelet for each activity. Our scheme extracts repetitive patterns from wave forms. Second, our framework builds decision tree models using a personalized library of A-Shapelets. Third, we present a low-overhead matching algorithm for classifying incoming accelerometer data stream in real-time. This paper reports a series of experiments to evaluate the proposed framework. Our experiments demonstrate that the performance of our scheme is very good and the accuracy is comparable to offline HAR systems.
- Research Article
63
- 10.1109/mic.2015.115
- Sep 1, 2015
- IEEE Internet Computing
Recent advancements in energy-harvesting hardware have created an opportunity for realizing batteryless wearables for continuous and pervasive human activity recognition (HAR). Unfortunately, power consumption of accelerometers used in conventional HAR is relatively high compared to the amount of power that can be harvested practically, which limits the usefulness of energy harvesting. Here, the authors present and evaluate a novel energy-harvesting wearable sensor architecture, HAR from Kinetic Energy (HARKE), that doesn't require using an accelerometer. Using off-the-shelf products, the authors demonstrate that the voltage of a kinetic harvester exhibits distinguishable patterns to distinctly infer human activities. Their results demonstrate that HARKE is as accurate as an accelerometer-based HAR system, yet consumes only a small fraction of the limited harvested energy.
- Research Article
- 10.1007/s11036-020-01658-5
- Oct 19, 2020
- Mobile Networks and Applications
Continuous activity recognition (CAR) plays an important role in human daily indoor activity monitoring and can be widely used in smart home, human-computer interaction and user authentication. Due to the privacy issue and limited coverage of video signals, RF-based CAR has attracted more and more attention in recent years. This paper focuses on three key problems in RF-based CAR: denoising, segmentation and recognition. We present the design and implementation of a contactless and sensorless continuous activity recognition system, namely WiCheck. Our basic idea is to utilize the temporal correlation between two adjacent actions in continuous activity to eliminate the cumulative error in continuous activity segmentation. Firstly, the multi-layer optimized noise elimination method is used to decrease the environment interference. Secondly, a method based on dual-swing window is proposed to reduce the cumulative error of continuous activity segmentation. Finally, WiCheck is implemented in different indoor environments, and 6 continuous activity sequences are designed to evaluate and analyze the influencing factors. The continuous activity recognition accuracy of WiCheck to two actions and three actions can approach 90% and 75%, respectively.
- Supplementary Content
6
- 10.3929/ethz-a-005228941
- May 1, 2006
- Repository for Publications and Research Data (ETH Zurich)
Wearable computers promise the ability to access information and computing resources directly from miniature devices embedded in our clothing. The problem lies in how to access the most relevant information without disrupting whatever task it is we are doing. Most existing interfaces, such as keyboards and touch pads, require direct interaction. This is both a physical and cognitive distraction. The problem is particularly acute for the mobile maintenance worker who must access information, such as on-line manuals or schematics, quickly and with minimal distraction. One solution is a wearable computer that monitors the user’s ‘context’ - information such as activity, location and environment. Being ‘context aware’, the wearable would be better placed to offer relevant information to the user as and when it is needed. In this work we focus on recognising one of the most important parts of context: user activity. The contributions of the thesis are twofold. First, we present a method for recognising hand activities from a sequence using body-worn sensors. Second, in evaluating this method, we present a generalised strategy for characterising the performance of activity recognition systems. We define a set of typical hand and tool activities in a woodwork assembly scenario. We evaluate two methods for detecting and recognising these activities using a combination of body-worn microphones and accelerometers. The first method uses two separately placed microphones on the user’s arm to locate the source of an activity. Whenever a sound is made close to the wrist, an interesting activity is assumed and classification is carried out using both acceleration and sound. The second method requires only wrist-worn sensors. It recognises activities by classifying sound and acceleration over a sliding window. The classifications from each of the sensor types are then compared, and a final result is given depending on how well they agree.
In the second part of the thesis we introduce a strategy for evaluating the performance of continuous activity recognition systems. Like any area of scientific research, activity recognition requires standard methods and measures of performance. These are the tools with which researchers can compare and evaluate different systems, thus allowing the field to advance. Continuous activity recognition, however, has a number of performance issues which existing evaluation strategies - borrowed from related fields such as speech recognition - fail to capture. We explore these issues in depth and propose a new strategy of performance evaluation based on the complete characterisation of error types common to the activity recognition problem. Finally, we bring the two main topics of the thesis together. Using the results from our continuous recognition work we show the improvements of the proposed performance evaluation strategy over existing approaches.
- Research Article
4
- 10.1109/jsen.2025.3530921
- Mar 15, 2025
- IEEE Sensors Journal
Recently, human activity recognition (HAR) has gained significant attention as a research field, leading to the development of diverse technologies driven by its broad range of application scenarios. Radar technology has attracted much attention because of its unique advantages such as not being limited by environmental conditions such as light, shadow, and occlusion. In this article, a continuous HAR system based on multidomain radar data fusion (CMDN) is proposed. Firstly, in order to capture more detailed motion features of the human body, we apply the short-time fractional Fourier transform (STFrFT) to map radar data into the fractional domain, yielding a novel representation of human motion. Secondly, we develop an activity detector based on variable window length short-time average/long-time average (VW-STA/LTA) to accurately identify the start/end points of continuous human actions, addressing the challenge of difficult sequence segmentation in continuous activity recognition tasks. Finally, based on the multi-input multitask (MIMT) recognition network, the features of each domain are processed in parallel, and multiple input representations are fused to obtain the continuous activity classification results with high precision.
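The activity detector in this abstract builds on the short-time average/long-time average (STA/LTA) ratio. The sketch below shows only the classic fixed-window form of that ratio with a simple threshold crossing. The variable-window scheme (VW-STA/LTA) is not described in the abstract and is not reproduced here. The function names, window lengths, and threshold are hypothetical.

```python
def sta_lta(energy, n_sta, n_lta):
    """Ratio of a trailing short-window mean to a trailing long-window mean.

    energy is a sequence of non-negative per-frame energies. Frames before
    the long window is full get a ratio of 0.0 (an assumption of this sketch).
    """
    if not 0 < n_sta <= n_lta:
        raise ValueError("require 0 < n_sta <= n_lta")
    ratios = []
    for i in range(len(energy)):
        if i + 1 < n_lta:
            ratios.append(0.0)
            continue
        sta = sum(energy[i + 1 - n_sta:i + 1]) / n_sta
        lta = sum(energy[i + 1 - n_lta:i + 1]) / n_lta
        ratios.append(sta / lta if lta > 0 else 0.0)
    return ratios


def detect_onsets(ratios, threshold):
    """Indices where the ratio rises from below the threshold to at or above it."""
    return [i for i in range(1, len(ratios))
            if ratios[i] >= threshold > ratios[i - 1]]


if __name__ == "__main__":
    # Synthetic energy trace: quiet, a burst of activity, quiet again.
    energy = [1] * 10 + [10] * 5 + [1] * 10
    print(detect_onsets(sta_lta(energy, n_sta=2, n_lta=8), threshold=2.0))
```

On this synthetic trace the ratio first crosses the threshold at frame 10, where the burst begins. An end-point detector could be sketched the same way with a falling-edge test.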
- Research Article
19
- 10.1109/thms.2015.2443037
- Oct 1, 2015
- IEEE Transactions on Human-Machine Systems
Understanding human activities is an essential capability for intelligent robots to help people in a variety of applications. Humans perform activities in a continuous fashion, and transitions between temporally adjacent activities are gradual. Our Fuzzy Segmentation and Recognition (FuzzySR) algorithm explicitly reasons about gradual transitions between continuous human activities. Our objective is to simultaneously segment a given video into a sequence of events and recognize the activity contained in each event. The algorithm uniformly segments the video into a sequence of nonoverlapping blocks, each lasting a short period of time. Then, a multivariable time series is formed by concatenating block-level human activity summaries that are computed using topic models over local spatiotemporal features extracted from each block. Through encoding an event as a fuzzy set with fuzzy boundaries to represent gradual transitions, our approach is capable of segmenting the continuous visual data into a sequence of fuzzy events. By incorporating all block summaries contained in an event, our algorithm determines the activity label for each event. To evaluate performance, we conduct experiments using six datasets. Our algorithm shows promising continuous activity segmentation results on these datasets and obtains the event-level activity recognition precision of 42.6%, 60.4%, 65.2%, and 78.9% on the Hollywood-2, CAD-60, ACT$4^2$, and UTK-CAP datasets, respectively.
- Conference Article
3
- 10.1145/3014812.3018840
- Jan 31, 2017
- Proceedings of the Australasian Computer Science Week Multiconference
Advances in energy harvesting hardware have created an opportunity for realizing self-powered wearables for continuous and pervasive human activity recognition (HAR). Unfortunately, the power requirements of the continuous activity sensing using accelerometer sensors and the burdensome on-node classification are relatively high compared to the amount of power that can be practically harvested, which limit the energy harvesting's usefulness. This thesis proposes a novel paradigm for HAR, which employs kinetic energy harvesting (KEH) and infers human activities directly from the KEH patterns. This novel approach guarantees energy neutrality by eliminating the need for powering accelerometer and reducing the on-node classification overhead, moving us closer towards self-powered autonomous activity monitoring wearables.
- Research Article
20
- 10.3390/s24072199
- Mar 29, 2024
- Sensors (Basel, Switzerland)
Frameworks for human activity recognition (HAR) can be applied in the clinical environment for monitoring patients’ motor and functional abilities either remotely or within a rehabilitation program. Deep Learning (DL) models can be exploited to perform HAR by means of raw data, thus avoiding time-demanding feature engineering operations. Most works targeting HAR with DL-based architectures have tested the workflow performance on data related to a separate execution of the tasks. Hence, a paucity in the literature has been found with regard to frameworks aimed at recognizing continuously executed motor actions. In this article, the authors present the design, development, and testing of a DL-based workflow targeting continuous human activity recognition (CHAR). The model was trained on the data recorded from ten healthy subjects and tested on eight different subjects. Despite the limited sample size, the authors claim the capability of the proposed framework to accurately classify motor actions within a feasible time, thus making it potentially useful in a clinical scenario.