Deep learning (DL)-based Human Activity Recognition (HAR) using wearable inertial measurement unit (IMU) sensors could revolutionize continuous health monitoring and early disease prediction. However, most DL HAR models are trained on limited, lab-controlled data, and their robustness to real-world variability has not been tested. In this study, we isolated and analyzed the effects of subject, device, position, and orientation variability on DL HAR models using the HARVAR and REALDISP datasets. We used the Maximum Mean Discrepancy (MMD) to quantify the shifts in data distribution caused by each type of variability, and examined the relationship between these distribution shifts and model performance. Our HARVAR results show that each type of variability significantly degraded DL model performance, with an inverse relationship between distribution shift and performance: the larger the shift, the lower the performance. The compounding effect of multiple variabilities, studied using REALDISP, further underscores the difficulty of generalizing DL HAR models to real-world conditions. These findings highlight the need for more robust models that generalize effectively to real-world settings. The MMD proved valuable for explaining the performance drops, which supports its utility for evaluating distribution shifts in HAR data.
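
As an illustration of how the MMD can quantify a distribution shift between two recording conditions, the following is a minimal sketch and not the implementation used in the study. It assumes a Gaussian (RBF) kernel with a hand-picked bandwidth parameter `gamma`, the biased (V-statistic) estimator of the squared MMD, and synthetic 3-D feature vectors standing in for IMU-derived features. The kernel, bandwidth, feature representation, and all function names here are assumptions, since the abstract does not specify them.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, gamma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))


def make_features(n, mean, rng):
    """Hypothetical 3-D feature vectors (synthetic stand-in for IMU features)."""
    return [[rng.gauss(mean, 1.0) for _ in range(3)] for _ in range(n)]


# Synthetic data only: a reference condition, a second sample drawn from the
# same distribution, and a sample whose mean is shifted (assumed values).
rng = random.Random(0)
reference = make_features(100, 0.0, rng)
same_condition = make_features(100, 0.0, rng)
shifted_condition = make_features(100, 1.5, rng)

mmd_same = mmd_squared(reference, same_condition, gamma=0.5)
mmd_shifted = mmd_squared(reference, shifted_condition, gamma=0.5)
print(f"MMD^2 (same condition):    {mmd_same:.4f}")
print(f"MMD^2 (shifted condition): {mmd_shifted:.4f}")
```

In this sketch, the sample drawn from the shifted distribution yields a larger squared MMD than the sample drawn from the reference distribution, which is the kind of signal that could then be compared against a drop in model performance. In practice the bandwidth is often chosen with a heuristic such as the median pairwise distance rather than fixed by hand.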