Abstract

We propose robust multi-dimensional motion features for human activity recognition from first-person videos. The proposed features encode information about motion magnitude, direction and variation, and combine them with virtual inertial data generated from the video itself. The use of grid flow representation, per-frame normalization and temporal feature accumulation enhances the robustness of our new representation. Results on multiple datasets demonstrate that the proposed feature representation outperforms existing motion features, and importantly it does so independently of the classifier. Moreover, the proposed multi-dimensional motion features are general enough to make them suitable for vision tasks beyond those related to wearable cameras.
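To make the representation concrete, below is a minimal sketch of a grid optical flow descriptor with per-frame normalization, written in Python with OpenCV. It is an illustration under stated assumptions, not the authors' implementation: the Farnebäck dense flow, the 4x4 grid and the bin counts are placeholder choices.

    import cv2
    import numpy as np

    def grid_flow_descriptor(prev_gray, curr_gray, grid=(4, 4),
                             mag_bins=4, dir_bins=8):
        # Dense optical flow between two grayscale frames (Farnebäck).
        flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])  # ang in radians
        h, w = mag.shape
        rows, cols = grid
        feats = []
        for i in range(rows):
            for j in range(cols):
                ys = slice(i * h // rows, (i + 1) * h // rows)
                xs = slice(j * w // cols, (j + 1) * w // cols)
                # Histograms of flow magnitude and direction per grid cell.
                m_hist, _ = np.histogram(mag[ys, xs], bins=mag_bins,
                                         range=(0.0, mag.max() + 1e-6))
                d_hist, _ = np.histogram(ang[ys, xs], bins=dir_bins,
                                         range=(0.0, 2 * np.pi))
                feats.extend(m_hist)
                feats.extend(d_hist)
        feats = np.asarray(feats, dtype=np.float32)
        # Per-frame L2 normalization, for robustness to changes in
        # overall flow energy from frame to frame.
        return feats / (np.linalg.norm(feats) + 1e-8)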

Highlights

  • Ambulatory activities such as Walk, Turn, Run, Sit, Stand, Go upstairs, Go downstairs and Left-right turn involve full-body motions

  • The extraction of both GOFF and virtual inertial feature (VIF) groups involves the appropriate setting of several parameters, namely grids G, window length L, overlapping ratio ν, direction bins βp, magnitude bins βm, frequency bands Nf and low-frequency coefficients Ns and Nc (see the sketches after this list)

  • The first state-of-the-art method is an interest point-based motion feature extraction approach presented in Zhang et al. [12,13], and referred to as multiresolution good-feature (MRGF) implemented with SURF, which was reported to achieve better accuracy than Shi and Tomasi features [35]
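A minimal configuration sketch grouping the parameters named above; the default values are illustrative placeholders rather than the settings tuned in the paper, and the reading of Ns and Nc as sine/cosine coefficient counts is an assumption.

    from dataclasses import dataclass

    @dataclass
    class MotionFeatureParams:
        # Parameters for GOFF and virtual inertial feature (VIF) extraction.
        # All defaults are illustrative placeholders, not the paper's settings.
        grids: int = 4              # G: grid cells per frame dimension
        window_length: int = 30     # L: temporal accumulation window (frames)
        overlap_ratio: float = 0.5  # nu: overlap between consecutive windows
        dir_bins: int = 8           # beta_p: direction histogram bins
        mag_bins: int = 4           # beta_m: magnitude histogram bins
        freq_bands: int = 4         # N_f: frequency bands for the VIF group
        low_freq_sin: int = 3       # N_s: low-frequency (sine) coefficients kept
        low_freq_cos: int = 3       # N_c: low-frequency (cosine) coefficients kept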

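The sketch below reads the virtual inertial data as a per-frame global motion signal summarized in the frequency domain; this reading, and the roles assigned to Nf, Ns and Nc, are assumptions for illustration, not the paper's exact construction.

    import numpy as np

    def virtual_inertial_features(motion_signal, n_bands=4, n_low=3):
        # motion_signal: 1-D per-frame global motion estimate (assumption:
        # e.g., mean optical-flow magnitude per frame, standing in for
        # inertial data generated from the video itself).
        spec = np.fft.rfft(motion_signal - np.mean(motion_signal))
        power = np.abs(spec) ** 2
        # Energy in n_bands equal-width frequency bands (role of N_f).
        band_energy = [band.sum() for band in np.array_split(power, n_bands)]
        # Low-frequency cosine-like (real) and sine-like (imaginary)
        # coefficients (roles of N_c and N_s).
        low = spec[1:n_low + 1]
        return np.concatenate([band_energy, low.real, low.imag]).astype(np.float32)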

Summary

Related works

Ambulatory activities such as Walk, Turn, Run, Sit, Stand, Go upstairs, Go downstairs and Left-right turn involve full-body motions. Interest point-based methods generally fail when there is not enough texture to detect interest points, or when the activities (e.g., Dribble) involve complex ego-motion, motion blur and parallax. These features are not appropriate for discriminating activities such as Jog and Run, as they do not encode motion characteristics other than direction (e.g., magnitude) [12,13]. While optical flow-based methods are frequently employed in the state of the art due to their sub-pixel accuracy and flexibility to work under different motion models [37], most existing works in this category do not exploit key motion characteristics, such as magnitude, which helps to discriminate activities with similar direction patterns (e.g., Jog, Run and Sprint).
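A simple way to see why magnitude matters: the mean dense-flow magnitude per frame already separates clips that share direction patterns but differ in speed. The sketch below is illustrative only (OpenCV Farnebäck flow), not the pipeline evaluated in the paper.

    import cv2
    import numpy as np

    def mean_flow_magnitudes(video_path):
        # Returns one mean flow magnitude per frame pair; faster activities
        # (e.g., Run vs. Jog) yield consistently larger values even when
        # their direction histograms look alike.
        cap = cv2.VideoCapture(video_path)
        ok, prev = cap.read()
        prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
        mags = []
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            mags.append(float(np.mean(np.hypot(flow[..., 0], flow[..., 1]))))
            prev_gray = gray
        cap.release()
        return np.asarray(mags)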

Proposed method
Feature extraction
Parameters analysis
Computation time
Datasets and validation protocol
Datasets
Experimental setup
Baseline method
Methods
Conclusions