Marker-based motion capture (mocap) is a conventional method used in biomechanics research to precisely analyze human movement. However, the time-consuming marker placement process and extensive post-processing limit its wider adoption. Markerless mocap systems that use deep learning to estimate 2D keypoints from images have therefore emerged as a promising alternative, but annotation errors in the datasets used to train these models can degrade estimation accuracy. To improve the precision of 2D keypoint annotation, we present a method that uses anatomical landmarks obtained from marker-based mocap. Specifically, we use multiple RGB cameras synchronized and calibrated with a marker-based mocap system to create a high-quality dataset (RRIS40) of images annotated with surface anatomical landmarks. A deep neural network is then trained to estimate these 2D anatomical landmarks, and a ray-distance-based triangulation is used to calculate the 3D marker positions. We conducted extensive evaluations on our RRIS40 test set, which consists of 10 subjects performing various movements. Compared against a marker-based system, our method achieves a mean Euclidean error of 13.23 mm in 3D marker position, which is comparable to the precision of marker placement itself. By learning to predict anatomical keypoints directly from images, our method outperforms OpenCap's augmentation of 3D anatomical landmarks from triangulated wild keypoints. These results highlight the method's potential to facilitate wider integration of markerless mocap into biomechanics research. The RRIS40 test set is publicly available for research purposes at koonyook.github.io/rris40.
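The abstract does not spell out the ray-distance-based triangulation or the error metric, so the following is only a minimal sketch under stated assumptions: each camera contributes a ray (camera center plus a direction through the detected 2D landmark), the 3D point is taken as the least-squares point minimizing the summed squared distance to those rays, and the error is the mean Euclidean distance to the reference markers. The function names `triangulate_rays` and `mean_euclidean_error` are hypothetical and are not taken from the paper.

```python
import numpy as np


def triangulate_rays(origins, directions):
    """Return the 3D point minimizing the summed squared distance to a set of rays.

    Assumption: rays are treated as infinite lines, and all views are weighted
    equally. The paper's actual formulation may differ.
    """
    origins = np.asarray(origins, dtype=float)
    directions = np.asarray(directions, dtype=float)
    # Normalize so that (I - d d^T) is a true orthogonal projector.
    directions = directions / np.linalg.norm(directions, axis=1, keepdims=True)

    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        P = np.eye(3) - np.outer(d, d)  # projects onto the plane orthogonal to d
        A += P
        b += P @ o
    # Normal equations of the least-squares problem; needs at least two
    # non-parallel rays for A to be invertible.
    return np.linalg.solve(A, b)


def mean_euclidean_error(pred, ref):
    """Mean Euclidean distance between predicted and reference 3D points."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return float(np.linalg.norm(pred - ref, axis=-1).mean())


# Synthetic usage: three hypothetical camera centers looking at one landmark.
target = np.array([0.1, -0.2, 1.5])
cam_centers = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.5]])
rays = target - cam_centers
estimate = triangulate_rays(cam_centers, rays)
```

With noise-free rays the estimate coincides with the target; with real detections the rays no longer intersect, and the residual distances to each ray give a rough per-view consistency check.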