It is difficult to visually track a user’s hand because of a hand’s many degrees of freedom (DOF). For this reason, most model-based hand pose tracking methods have relied on multiview or RGB-D images. This paper proposes a model-based method that accurately tracks three-dimensional hand poses in real time using monocular RGB images. The main idea of the proposed method is to reduce hand tracking ambiguity by adopting a step-by-step estimation scheme consisting of three steps performed in consecutive order: palm pose estimation, finger yaw motion estimation, and finger pitch motion estimation. In addition, this paper proposes highly effective algorithms for each step. Under the assumption that a human hand can be considered an assemblage of articulated planes, the proposed method uses a piecewise planar hand model that enables hand model regeneration: the hand model is modified to fit the current user’s hand, which improves the accuracy of the hand pose estimation results. Notably, the proposed method operates in real time using only CPU-based processing. Consequently, it can be applied to various platforms, including egocentric vision devices such as wearable glasses. The results of several experiments verify the efficiency and accuracy of the proposed method.
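As a rough illustration of the yaw/pitch decomposition under the piecewise planar assumption, the sketch below computes forward kinematics for a single finger: one yaw angle at the knuckle selects the finger plane relative to the palm, and the pitch angles articulate the bone segments within that plane. This is a minimal sketch of the kinematic parameterization only, not the paper’s estimation algorithms; the function names, axis conventions, and bone lengths are illustrative assumptions.

```python
import numpy as np

def rot_z(theta):
    """Rotation about the palm normal (finger yaw / abduction)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(theta):
    """Rotation within the finger plane (pitch / flexion)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def finger_joints(base, yaw, pitches, bone_lengths):
    """Hypothetical forward kinematics for one finger: a single yaw
    fixes the articulated finger plane, and per-segment pitch angles
    accumulate as in-plane flexion (angles in radians, lengths in mm)."""
    R = rot_z(yaw)                  # yaw picks the finger plane
    p = base.astype(float)
    joints = [p]
    for pitch, length in zip(pitches, bone_lengths):
        R = R @ rot_x(pitch)        # accumulate flexion within the plane
        p = p + R @ np.array([0.0, length, 0.0])  # advance along the bone
        joints.append(p)
    return np.stack(joints)

# Example: index finger with slight abduction and partial flexion
# (all values are made up for illustration).
pts = finger_joints(np.array([0.0, 90.0, 0.0]),
                    yaw=0.1, pitches=[0.5, 0.6, 0.4],
                    bone_lengths=[40.0, 25.0, 20.0])
print(pts.round(1))
```

Estimating these parameter groups in the order palm pose, then yaw, then pitch means each step searches a much smaller space conditioned on the previous step’s result, which is the ambiguity reduction the abstract describes.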