Landmark-based multimodal human action recognition

Stylianos Asteriadis, Petros Daras

doi:10.1007/s11042-016-3945-6

Abstract

Human activity recognition has received a lot of attention recently, mainly thanks to advances in sensing technologies and the increasing computational power of systems. However, the complexity of human movements, sensor noise and person-specific characteristics pose challenges that remain to be overcome. In the proposed work, a novel multi-modal human action recognition method is presented for handling these issues. Each action is represented by a basis vector, and spectral analysis is performed on an affinity matrix of new action feature vectors. By using modality-dependent kernel regressors to compute the affinity matrix, complexity is reduced and robust low-dimensional representations are achieved. The proposed scheme supports dynamic, online adaptation of modalities according to their automatically inferred reliability. Evaluation on three publicly available datasets demonstrates the potential of the approach.

Highlights

Human-machine interaction is entering a new era, with computers altering the way they respond to human stimuli.
In human activity recognition problems, feature pre-processing depends strongly on the cue being used.
For new data vectors, no local sub-manifold unfolding is necessary; inference requires only simple matrix operations. This is significant because it allows real-time action recognition and makes the proposed method suitable for evaluating online whether the projection of multi-modal features over the course of an action is close to the subspace classes of a trained model.
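To make the last highlight concrete, the following is a minimal sketch of inference that uses only a kernel evaluation and a matrix product. It is not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, the nearest-centroid decision rule and the names `landmarks`, `basis` and `centroids` are all assumptions introduced for illustration.

```python
import numpy as np

def landmark_weights(x, landmarks, sigma=1.0):
    """Gaussian kernel weights of x on each landmark, normalised to sum to 1.

    The Gaussian kernel and sigma are illustrative assumptions.
    """
    d2 = np.sum((landmarks - x) ** 2, axis=1)
    # Subtracting the minimum leaves the normalised weights unchanged
    # but avoids underflow when all distances are large.
    w = np.exp(-(d2 - d2.min()) / (2.0 * sigma ** 2))
    return w / w.sum()

def project(x, landmarks, basis, sigma=1.0):
    """Map x to the low-dimensional space with a single matrix product.

    basis (p x k) is a hypothetical matrix learned offline that holds
    the low-dimensional coordinates of the p landmarks.
    """
    return landmark_weights(x, landmarks, sigma) @ basis

def classify(x, landmarks, basis, centroids, sigma=1.0):
    """Return the index of the class centroid closest to the projection of x."""
    y = project(x, landmarks, basis, sigma)
    return int(np.argmin(np.linalg.norm(centroids - y, axis=1)))

if __name__ == "__main__":
    landmarks = np.array([[0.0, 0.0], [10.0, 10.0]])  # toy landmarks
    basis = np.eye(2)                                 # toy landmark coordinates
    centroids = np.array([[1.0, 0.0], [0.0, 1.0]])    # toy class centroids
    print(classify(np.array([0.5, 0.2]), landmarks, basis, centroids))
    print(classify(np.array([9.5, 10.1]), landmarks, basis, centroids))
```

Because nothing is re-fitted at test time, the cost per new vector is linear in the number of landmarks, which is what makes an online setting plausible under these assumptions.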

Summary

Introduction

Human-machine interaction is entering a new era, with computers altering the way they respond to human stimuli. Wearable inertial measurement sensors [11], robust video processing algorithms [1], infrared and depth sensors [7] and audio [27] are only a few of the cues available for understanding human activity. These advances have brought automatic action recognition to the forefront of many applications, ranging from entertainment to health-care systems. A low-dimensional representation of high-dimensional feature vectors is used, following a landmark-based spectral analysis scheme. In this way, low-dimensional subspaces that encode valuable information are built, and new, unknown actions are projected onto them. The proposed technique builds on the authors’ preliminary work on Microsoft Kinect-based activity recognition using spectral analysis [3], where results were presented only for the single-modality case of depth data, inter- and intra-individual sub-actions were not considered, and experiments were limited to a single scenario.
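As a rough illustration of what a landmark-based spectral analysis scheme can look like, the sketch below follows the generic landmark-based spectral embedding recipe: express every sample as kernel-regression weights on a few nearby landmarks, then take a singular value decomposition of the normalised weight matrix. This is an assumption-laden sketch rather than the method of the paper; the Gaussian kernel, the parameters `r` and `sigma`, the random landmark selection in the usage example and the function names are all hypothetical.

```python
import numpy as np

def landmark_affinity(X, landmarks, r=3, sigma=1.0):
    """Return Z (n x p): row i holds normalised Gaussian weights of
    sample i on its r nearest landmarks and zeros elsewhere."""
    # Squared Euclidean distances between all samples and all landmarks.
    d2 = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :r]          # indices of r closest landmarks
    rows = np.arange(X.shape[0])[:, None]
    dn = d2[rows, nearest]                           # n x r, sorted ascending
    # Shift by the smallest distance per row for numerical stability.
    w = np.exp(-(dn - dn[:, :1]) / (2.0 * sigma ** 2))
    Z = np.zeros_like(d2)
    Z[rows, nearest] = w / w.sum(axis=1, keepdims=True)
    return Z

def spectral_embedding(Z, k):
    """Embed the n samples in k dimensions from the affinity Z (n x p).

    The left singular vectors of the column-normalised Z are eigenvectors
    of the implicit n x n graph matrix, which is never formed explicitly.
    """
    col = Z.sum(axis=0)
    col[col == 0] = 1.0                              # guard against unused landmarks
    Zh = Z / np.sqrt(col)
    U, _, _ = np.linalg.svd(Zh, full_matrices=False)
    return U[:, :k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic clusters standing in for two action classes.
    X = np.vstack([rng.normal(0, 0.1, (20, 4)), rng.normal(5, 0.1, (20, 4))])
    landmarks = X[rng.choice(len(X), 6, replace=False)]  # random landmark choice
    Y = spectral_embedding(landmark_affinity(X, landmarks), k=2)
    print(Y.shape)
```

The decomposition is applied to an n x p matrix with p much smaller than n, so the cost grows linearly with the number of training samples instead of requiring a full n x n eigendecomposition.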

Related work

Landmark-based action recognition

Dynamic fusion of different modalities

Classification of new instances

Skoda Mini Checkpoint Dataset

Berkeley MHAD database

Conclusions

Journal: Multimedia Tools and Applications	Publication Date: Sep 19, 2016
Citations: 13	License type: open-access
