Abstract

This paper addresses the problem of predicting human actions in depth videos. Because human actions have a complex spatiotemporal structure, it is difficult to infer an ongoing action before it is fully executed. To address this, we first propose two new depth-based features, pairwise relative joint orientations (PRJOs) and depth patch motion maps (DPMMs), which represent the relative movements between each pair of joints and human-object interactions, respectively. Both features are suitable for recognizing and predicting human actions in real time. We then propose a regression-based learning approach with a group-sparsity-inducing regularizer that learns an action predictor from the combination of PRJOs and DPMMs for a sparse set of joints. Experimental results on benchmark datasets demonstrate that the proposed approach significantly outperforms existing methods for real-time human action recognition and prediction from depth data.

Highlights

  • Predicting ongoing human actions based on incomplete observations plays an important role in many real-world applications such as surveillance, clinical monitoring, and human-robot interaction

  • Experimental results on benchmark datasets have demonstrated that our proposed approach significantly outperforms existing methods for real-time human action recognition and prediction from depth data

  • In contrast to previous depth-based features, this paper proposes pairwise relative joint orientations (PRJOs) and depth patch motion maps (DPMMs) to characterize the spatiotemporal relations among joints and the depth appearance of human-object interactions for real-time action prediction

Summary

Introduction

Predicting ongoing human actions from incomplete observations plays an important role in many real-world applications such as surveillance, clinical monitoring, and human-robot interaction. We first propose two new depth-based features, pairwise relative joint orientations (PRJOs) and depth patch motion maps (DPMMs), extracted from skeletal and depth map data. Over the duration of an action, PRJOs represent the relative movement between each pair of joints, and DPMMs represent the local depth appearance of interactions between the human and environmental objects. The two features complement each other as a bundle for each individual joint and are suitable for real-time prediction. In particular, DPMMs are a depth appearance-based feature that characterizes human-object interactions.
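
To make the idea of a pairwise joint feature concrete, the following is a minimal sketch of one plausible per-frame computation: the unit direction vector between every pair of skeleton joints. This summary does not give the paper's exact PRJO definition, so the function name `pairwise_relative_orientations`, the unit-vector formulation, and the toy three-joint frame are all assumptions made for illustration only.

```python
import numpy as np


def pairwise_relative_orientations(joints, eps=1e-8):
    """Unit direction vectors between every pair of joints in one frame.

    Hypothetical helper (an assumption, not the paper's definition).
    joints: array of shape (J, 3) holding 3D joint positions.
    Returns an array of shape (J * (J - 1) / 2, 3), one row per pair (i, j), i < j.
    """
    joints = np.asarray(joints, dtype=float)
    # Indices of all unordered joint pairs (i < j), in row-major order.
    i_idx, j_idx = np.triu_indices(len(joints), k=1)
    # Vector pointing from joint i to joint j.
    diffs = joints[j_idx] - joints[i_idx]
    # Normalize to unit length; eps guards against coincident joints.
    norms = np.linalg.norm(diffs, axis=1, keepdims=True)
    return diffs / np.maximum(norms, eps)


# Toy frame with three joints (illustrative values only).
frame = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0]])
prjo = pairwise_relative_orientations(frame)
print(prjo.shape)
```

Because each row depends only on the direction between two joints, a feature of this form is unaffected by translating the whole skeleton, and computing it per frame is cheap enough for real-time use. Tracking how these directions change across frames would then capture relative movement between joints.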

Related Work
Depth-Based Feature Construction
Group Sparse Regression-Based Learning Model
Experiments
Conclusions