Abstract

The task of human action recognition (HAR) arises in many practical computer vision applications. Various data modalities have been considered for solving this task, including joint-based skeletal representations, which are suitable for real-time applications on platforms with limited computational resources. We propose a spatio-temporal neural network that uses handcrafted geometric features to classify human actions from video data. The proposed deep neural network architecture combines graph convolutional and temporal convolutional layers. Experiments performed on public HAR datasets show that our model achieves results comparable to other state-of-the-art methods while having a lower inference time and offering the possibility of obtaining an explanation for the classified action.
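The abstract describes an architecture that alternates graph convolutions (spatial, over the skeleton's joints) with temporal convolutions (over frames). As a rough illustration only, not the authors' implementation, the sketch below shows one such spatio-temporal block in NumPy: all shapes, the adjacency matrix, and the weight initializations are hypothetical placeholders.

```python
import numpy as np

# Hypothetical sizes: T frames, J skeleton joints, C_in/C_out channels.
T, J, C_in, C_out = 16, 25, 3, 8
rng = np.random.default_rng(0)

# Placeholder symmetric adjacency; a real skeleton graph would come
# from the dataset's joint-connectivity definition.
A = (rng.random((J, J)) > 0.8).astype(float)
A = np.maximum(A, A.T)
A += np.eye(J)                                  # self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt             # symmetric normalization

X = rng.standard_normal((T, J, C_in))           # per-frame joint features
W_g = rng.standard_normal((C_in, C_out))        # graph-conv weights

# Spatial step: graph convolution applied independently to each frame.
H = np.einsum("jk,tkc,cd->tjd", A_hat, X, W_g)
H = np.maximum(H, 0.0)                          # ReLU

# Temporal step: 1D convolution over time, per joint and channel.
K = 3                                           # temporal kernel size
W_t = rng.standard_normal(K) / K
Y = np.stack(
    [np.convolve(H[:, j, d], W_t, mode="same")
     for j in range(J) for d in range(C_out)],
    axis=1,
).reshape(T, J, C_out)
print(Y.shape)  # (16, 25, 8)
```

Stacking such blocks lets the network mix information across both body structure and time before a final classification layer, which is the general pattern behind spatio-temporal graph networks for skeleton-based HAR.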
