Abstract

In this paper, we go beyond simply recognizing human interactions in videos collected from CCTV-based surveillance systems: we propose an approach that describes common person-person activities in daily life in detail based on human poses. The joint coordinates of detected persons are first located by an articulated-body estimation algorithm based on a tree graphical structure. Relational features, consisting of intra-person and inter-person features that encode joint distance and angle information, are used to describe the relationships between the body parts of each individual and the interaction between the two participants. Moreover, the interaction is also modeled in the spatio-temporal dimension to improve discrimination among complex activities with highly similar representations. We validate our interaction recognition method on two practical datasets, the BIT-Interaction dataset and the UT-Interaction dataset, using a multi-class Support Vector Machine. The experimental results demonstrate that the proposed approach using pose-based body features outperforms recent interaction recognition approaches in terms of classification accuracy.
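To make the feature construction concrete, the following is a minimal sketch of intra- and inter-person relational features built from joint distances and angles, followed by a multi-class SVM. It assumes 2D joint coordinates from a pose estimator; the joint count, the mean pooling over frames, and all variable names are illustrative assumptions rather than the authors' exact implementation.

```python
# Minimal sketch of pose-based relational features (joint distances and angles)
# for two interacting persons, classified with a multi-class SVM.
# Shapes, joint counts, and pooling are assumptions for illustration only.
import numpy as np
from itertools import combinations
from sklearn.svm import SVC

def relational_features(pose_a, pose_b):
    """Distance/angle features for a pair of 2D poses.

    pose_a, pose_b: (num_joints, 2) arrays of joint coordinates,
    e.g. produced by an articulated-body pose estimator.
    """
    feats = []
    # Intra-person features: distance and angle between joint pairs of one person.
    for pose in (pose_a, pose_b):
        for i, j in combinations(range(len(pose)), 2):
            d = pose[j] - pose[i]
            feats.append(np.linalg.norm(d))       # joint distance
            feats.append(np.arctan2(d[1], d[0]))  # joint angle
    # Inter-person features: distance and angle between joints of the two participants.
    for i in range(len(pose_a)):
        for j in range(len(pose_b)):
            d = pose_b[j] - pose_a[i]
            feats.append(np.linalg.norm(d))
            feats.append(np.arctan2(d[1], d[0]))
    return np.asarray(feats)

# Toy usage: pool per-frame features over each clip, then train a multi-class SVM.
rng = np.random.default_rng(0)
clips = [[(rng.random((15, 2)), rng.random((15, 2))) for _ in range(10)]  # 10 frames per clip
         for _ in range(20)]                                              # 20 clips
X = np.stack([np.mean([relational_features(a, b) for a, b in clip], axis=0) for clip in clips])
y = rng.integers(0, 4, size=len(X))  # placeholder interaction labels
clf = SVC(kernel="rbf", decision_function_shape="ovo").fit(X, y)
print(clf.predict(X[:3]))
```

In practice, the per-frame features would be aggregated over time (the spatio-temporal modeling mentioned above) rather than simply averaged as in this toy example.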
