Abstract

We propose an approach to human interaction recognition (HIR) in videos based on multinomial kernel logistic regression with group-of-features relevance (GFR-MKLR). The approach couples kernel modelling with group sparsity to achieve highly precise interaction classification. The group structure in GFR-MKLR is chosen to represent interactions at the level of gestures, which improves robustness to intra-class variability caused by occlusions and by changes in subject appearance, body size and viewpoint. Each group consists of motion features extracted by tracking the joints of the interacting persons over time. We encode group sparsity in GFR-MKLR through relevance weights that reflect how well each group (gesture) discriminates between interaction categories. These weights are estimated automatically during GFR-MKLR training by gradient descent minimisation. Our model is computationally efficient and can be trained on a small dataset while maintaining good generalisation and interpretability. Experiments on the well-known UT-Interaction dataset demonstrate the performance of our approach in comparison with state-of-the-art methods.
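To make the idea of group relevance weights concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian kernel in which each feature group contributes its squared distance scaled by a non-negative weight, a softmax over kernel expansions as the multinomial model, and projected gradient descent on a regularised negative log-likelihood. The function names, the kernel form, the penalties and the step sizes are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

def group_sq_dists(A, B, groups):
    """Per-group squared distances, shape (n_groups, len(A), len(B))."""
    return np.stack([
        ((A[:, g][:, None, :] - B[:, g][None, :, :]) ** 2).sum(axis=-1)
        for g in groups
    ])

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # shift for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def fit_gfr_mklr(X, y, groups, n_classes, lr_alpha=0.1, lr_w=0.01,
                 n_iter=300, l2=1e-3, l1=1e-2):
    """Hypothetical trainer: learns kernel coefficients alpha and group weights w."""
    n = len(X)
    D = group_sq_dists(X, X, groups)         # (G, n, n)
    Y = np.eye(n_classes)[y]                 # one-hot labels
    alpha = np.zeros((n, n_classes))
    w = np.ones(len(groups)) / len(groups)   # assumed uniform initial relevance
    for _ in range(n_iter):
        # Assumed kernel: K_ij = exp(-sum_g w_g * D_g[i, j])
        K = np.exp(-np.tensordot(w, D, axes=1))
        P = softmax(K @ alpha)
        R = (P - Y) / n                      # gradient of mean NLL w.r.t. scores
        grad_alpha = K.T @ R + l2 * alpha
        dK = R @ alpha.T                     # gradient of mean NLL w.r.t. K
        # dK_ij/dw_g = -D_g[i, j] * K_ij; the l1 term encourages sparse weights
        grad_w = np.array([-(dK * K * Dg).sum() for Dg in D]) + l1
        alpha -= lr_alpha * grad_alpha
        w = np.maximum(w - lr_w * grad_w, 0.0)  # project onto w >= 0
    return alpha, w

def predict(X_new, X_train, alpha, w, groups):
    D = group_sq_dists(X_new, X_train, groups)
    K = np.exp(-np.tensordot(w, D, axes=1))
    return softmax(K @ alpha).argmax(axis=1)
```

Here `groups` is a list of column-index lists, one per gesture-level feature group. Under these assumptions, a group whose weight is driven to zero no longer influences the kernel, which is one way such a model could be read for interpretability.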
