Abstract

Using hand skeleton data to understand complex hand actions, such as assembly tasks or kitchen activities, is an important yet challenging problem. This paper introduces an unsupervised, hand graph-based spatio-temporal feature extraction method. To evaluate the efficacy of the proposed representation, we consider action segmentation and action recognition tasks. The segmentation problem involves an assembly task in an industrial setting, while the recognition problem deals with kitchen and office activities. For both tasks, we propose two novel notions of stability, loss function stability (LFS) and estimation stability with cross-validation (ESCV), which quantify the robustness of the solutions obtained. The proposed feature extraction yields classification performance comparable to state-of-the-art methods, while achieving significantly better accuracy and stability in a cross-person setting. The proposed method also outperforms existing methods on the segmentation task in terms of accuracy and is robust to changes in the input hyperparameters.
