Abstract
Accurate and reliable human pose estimation (HPE) is essential in interactive systems, particularly for applications requiring personalized adaptation, such as controlling cooperative robots and wearable exoskeletons, as well as healthcare monitoring equipment. However, continuously maintaining diverse datasets and frequently updating models for individual adaptation are both resource-intensive and time-consuming. To address these challenges, we propose a meta-transfer learning framework that integrates multimodal inputs, including high-frequency surface electromyography (sEMG), visual-inertial odometry (VIO), and high-precision image data. The framework improves both accuracy and stability through a knowledge fusion strategy that resolves data alignment issues and ensures seamless integration of the different modalities. To further enhance adaptability, we introduce a training and adaptation framework with few-shot learning that enables efficient updating of the encoders and decoders for dynamic feature adjustment in real-time applications. Experimental results demonstrate that our framework provides accurate, high-frequency pose estimates, particularly for intra-subject adaptation. Our approach adapts efficiently to new individuals from only a few samples, providing an effective solution for personalized motion analysis with minimal data.
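The abstract does not specify which meta-learning algorithm underlies the few-shot adaptation, so the following is a minimal sketch, assuming a MAML-style inner/outer optimization over per-subject tasks, of how an encoder-decoder pose regressor could be meta-trained and then adapted from a few support samples. All names (PoseRegressor, inner_adapt, meta_step), the fused 64-dimensional multimodal feature, and the 12-dimensional pose output are hypothetical illustrations, not the paper's API.

```python
# Hypothetical sketch: MAML-style few-shot adaptation of a pose regressor.
# Assumes fused multimodal features (e.g., sEMG + VIO + image embeddings)
# have already been aligned into a single 64-dim vector per time step.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseRegressor(nn.Module):
    """Toy encoder-decoder mapping fused features to joint-pose targets."""
    def __init__(self, in_dim=64, hidden=128, out_dim=12):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, out_dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))

def inner_adapt(model, x_support, y_support, lr=1e-2, steps=1):
    """Inner loop: a few gradient steps on one subject's support set,
    returning fast weights as a functional parameter dict."""
    params = dict(model.named_parameters())
    for _ in range(steps):
        pred = torch.func.functional_call(model, params, (x_support,))
        loss = F.mse_loss(pred, y_support)
        grads = torch.autograd.grad(loss, list(params.values()),
                                    create_graph=True)
        params = {n: p - lr * g
                  for (n, p), g in zip(params.items(), grads)}
    return params

def meta_step(model, meta_opt, tasks, inner_lr=1e-2):
    """Outer loop: query-set loss after inner adaptation drives the
    meta-update of the shared encoder/decoder initialization."""
    meta_opt.zero_grad()
    meta_loss = 0.0
    for xs, ys, xq, yq in tasks:  # one (support, query) task per subject
        fast = inner_adapt(model, xs, ys, lr=inner_lr)
        pred_q = torch.func.functional_call(model, fast, (xq,))
        meta_loss = meta_loss + F.mse_loss(pred_q, yq)
    meta_loss = meta_loss / len(tasks)
    meta_loss.backward()
    meta_opt.step()
    return meta_loss.item()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = PoseRegressor()
    meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    # Synthetic stand-in tasks: (support_x, support_y, query_x, query_y).
    tasks = [(torch.randn(5, 64), torch.randn(5, 12),
              torch.randn(10, 64), torch.randn(10, 12)) for _ in range(4)]
    for epoch in range(3):
        print("meta loss:", meta_step(model, meta_opt, tasks))
```

At deployment, adapting to a new individual would amount to calling inner_adapt once with that person's few labeled samples and using the returned fast weights for inference, which matches the abstract's claim of efficient per-subject adaptation from minimal data.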