Grasping is a fundamental action in daily life, and it is particularly evident at mealtimes, when people grasp tableware such as chopsticks, spoons, forks, bowls, and cups, each serving a specific purpose. Although tableware usage varies across regions and cultures, recognizing grasping actions is crucial for assessing performance in daily activities. In this study, we focus on assessing grasping functionality in terms of tableware usage during meals and propose a method for identifying hand movements. Recent years have seen a surge of deep learning approaches to hand pose estimation and gesture recognition. However, these approaches share common challenges: they need large-scale datasets and hyperparameter tuning, incur significant time and computational costs, and have limited applicability to incremental learning. To address these challenges, we propose an ensemble approach that employs extreme learning machines to recognize grasp postures. In addition, we apply spatiotemporal modeling to extract the relationship between grasp postures and the surrounding tools during mealtimes.
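For readers unfamiliar with extreme learning machines (ELMs), the following is a minimal sketch of an ELM ensemble classifier. It is not the implementation used in this study. The class names `ELM` and `ELMEnsemble`, the 6-dimensional synthetic features standing in for hand-pose descriptors, the three posture classes, the hidden-layer size, the ridge term, and the score-averaging rule are all assumptions chosen for illustration. The sketch shows why training is cheap: the hidden-layer weights are random and fixed, so only the output weights are fitted, and that fit is a closed-form least-squares solve.

```python
import numpy as np


class ELM:
    """Single-hidden-layer extreme learning machine (illustrative sketch)."""

    def __init__(self, n_hidden, n_classes, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.n_classes = n_classes
        self.ridge = ridge
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random nonlinear feature map; W and b are never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_features = X.shape[1]
        self.W = self.rng.normal(size=(n_features, self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(self.n_classes)[y]  # one-hot targets
        # Output weights from ridge-regularized least squares (closed form).
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def decision(self, X):
        return self._hidden(X) @ self.beta


class ELMEnsemble:
    """Averages the class scores of several ELMs with different random seeds."""

    def __init__(self, n_members=10, n_hidden=50, n_classes=3, ridge=1e-3):
        self.members = [
            ELM(n_hidden, n_classes, ridge, seed=s) for s in range(n_members)
        ]

    def fit(self, X, y):
        for member in self.members:
            member.fit(X, y)
        return self

    def predict(self, X):
        scores = np.mean([m.decision(X) for m in self.members], axis=0)
        return scores.argmax(axis=1)


# Synthetic stand-in data (assumption): three "grasp posture" classes,
# each a Gaussian cluster in a 6-dimensional feature space.
rng = np.random.default_rng(42)
centers = 3.0 * np.eye(3, 6)
labels = np.repeat(np.arange(3), 100)
features = centers[labels] + 0.5 * rng.normal(size=(300, 6))

# Shuffle, then use half of the samples for training and half for testing.
order = rng.permutation(300)
train, test = order[:150], order[150:]

model = ELMEnsemble().fit(features[train], labels[train])
pred = model.predict(features[test])
accuracy = float(np.mean(pred == labels[test]))
print(f"held-out accuracy: {accuracy:.3f}")
```

Averaging scores across members reduces the variance introduced by any single random hidden layer. Other combination rules, such as majority voting, would fit the same structure.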