Abstract

The use of depth sensors in activity recognition is an emerging technology in human-computer interaction. This study presents an approach for recognising human-to-human interactions from depth information. Both hand-crafted features and deep features extracted from depth frames are studied. After strong features are selected and ranked with the ReliefF algorithm, depth frames are assigned to words. Interaction sequences are then represented as histograms of words, and a non-linear input mapping is applied over the histogram bins to minimise differences among subjects. Random forest, K-nearest neighbour, and support vector machine (SVM) classifiers are trained on these histograms. The final model is tested on the SBU and K3HI datasets and compared with methods in the literature. In the experiments, joint distances, joint angles, and spherical coordinates of the joints were the best-performing features. The best results are obtained with the composite-kernel SVM combined with ReliefF and input mapping. ReliefF selects and ranks the best features in the feature set, while input mapping reduces differences among interactions performed by different actors.
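
The following is a minimal sketch of the frame-to-word and histogram steps described above, not the authors' implementation. It assumes a codebook of word centres is already available (for example from clustering), and every function name is hypothetical. The abstract does not specify the form of the non-linear input mapping, so the power transform in `map_bins` is only an illustrative stand-in.

```python
import math
from collections import Counter


def joint_distance(a, b):
    """Euclidean distance between two 3-D joint positions."""
    return math.dist(a, b)


def spherical(joint, origin=(0.0, 0.0, 0.0)):
    """Return (r, theta, phi) of a joint relative to an origin point."""
    x, y, z = (j - o for j, o in zip(joint, origin))
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0 else 0.0  # polar angle from the z axis
    phi = math.atan2(y, x)                      # azimuth in the x-y plane
    return r, theta, phi


def nearest_word(feature, codebook):
    """Index of the codebook entry closest to a frame's feature vector."""
    return min(range(len(codebook)),
               key=lambda k: math.dist(feature, codebook[k]))


def word_histogram(frames, codebook):
    """L1-normalised histogram of word indices over a sequence of frames."""
    if not frames:
        return [0.0] * len(codebook)
    counts = Counter(nearest_word(f, codebook) for f in frames)
    return [counts.get(k, 0) / len(frames) for k in range(len(codebook))]


def map_bins(hist, gamma=0.5):
    """Assumed non-linear mapping: a power transform applied to each bin."""
    return [h ** gamma for h in hist]


# Toy example: two words, four frames with 2-D feature vectors.
codebook = [(0.0, 0.0), (1.0, 1.0)]
frames = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (0.2, 0.1)]
hist = word_histogram(frames, codebook)
mapped = map_bins(hist)
```

The mapped histograms would then serve as the input vectors for a classifier such as a random forest, K-nearest neighbour, or SVM.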
