Abstract

Human activity recognition is a well-established research problem in ubiquitous computing. The increasing dependence on various smart devices in our daily lives allows us to investigate the wealth of sensor data produced by the multimodal sensors embedded in these devices. However, the raw sensor data are often unlabeled, and annotating this vast amount of data is a costly exercise that can also lead to privacy breaches. Self-supervised learning-based approaches are at the forefront of learning semantic representations from unlabeled sensor data, including for human activity recognition tasks. As inferring human activity depends on multimodal sensors, addressing modality differences and inter-modality dependencies within a model is an important step. This paper proposes a novel self-supervised learning approach, modality-aware contrastive learning (MACL), for representation learning from multimodal sensor data. The approach treats different sensing modalities as different views of an input signal, so the model learns representations by maximizing the similarity among the modality-specific views of the same input. Extensive experiments were performed on four publicly available human activity recognition data sets to verify the effectiveness of the proposed MACL method. The experimental results show that MACL attains human activity recognition performance comparable to that of the baseline models and exceeds the performance of models that use standard augmentation transformation strategies.
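The core idea, that each sensing modality provides a distinct view of the same input window and that embeddings of these views should agree, can be illustrated with a standard cross-modal contrastive (NT-Xent/InfoNCE-style) objective. The sketch below is illustrative only: the encoder architecture, embedding size, temperature, and the accelerometer/gyroscope pairing are assumptions for demonstration, not the exact MACL formulation from the paper.

```python
# Illustrative sketch of a modality-view contrastive objective (NT-Xent style).
# Encoder design, temperature, and modality names are assumptions, not the paper's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityEncoder(nn.Module):
    """Small 1D-CNN encoder for one sensing modality (hypothetical architecture)."""

    def __init__(self, in_channels: int, embed_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.proj = nn.Linear(32, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time)
        h = self.net(x).squeeze(-1)                # (batch, 32)
        return F.normalize(self.proj(h), dim=-1)   # unit-norm embeddings


def cross_modal_nt_xent(z_a: torch.Tensor, z_b: torch.Tensor,
                        temperature: float = 0.1) -> torch.Tensor:
    """Pull together the two modality views of the same input window.

    z_a, z_b: (batch, dim) embeddings from two modality encoders for the same
    batch of windows; row i of z_a and row i of z_b form a positive pair,
    while all other rows in the batch act as negatives.
    """
    logits = z_a @ z_b.t() / temperature           # (batch, batch) similarity matrix
    targets = torch.arange(z_a.size(0), device=z_a.device)
    # Symmetrize so each modality serves as both anchor and positive.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    # Toy example: 8 windows with accelerometer (3 ch) and gyroscope (3 ch) streams.
    acc, gyro = torch.randn(8, 3, 128), torch.randn(8, 3, 128)
    enc_acc, enc_gyro = ModalityEncoder(3), ModalityEncoder(3)
    loss = cross_modal_nt_xent(enc_acc(acc), enc_gyro(gyro))
    print(f"contrastive loss: {loss.item():.4f}")
```

In this kind of setup, the learned encoders can later be frozen or fine-tuned with a small labeled set for the downstream activity classification task, which is the usual evaluation protocol for self-supervised human activity recognition.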
