Abstract

Modern smartphones and wearables often contain multiple embedded sensors that generate significant amounts of data. This information can be used in body-monitoring applications such as healthcare, indoor location, user-adaptive recommendations and transportation. Developing Human Activity Recognition (HAR) algorithms involves collecting a large amount of labelled data, which should be annotated by an expert. However, annotating large datasets is expensive and time consuming, so annotated data are difficult to obtain. Developing a HAR approach that requires low annotation effort while maintaining adequate performance is a relevant challenge. We introduce a Semi-Supervised Active Learning (SSAL) approach based on Self-Training (ST) for Human Activity Recognition that partially automates the annotation process, reducing both the annotation effort and the volume of annotated data required to obtain a high-performance classifier. Our approach uses a criterion to select the most relevant samples for annotation by the expert and propagates their labels to the most confident samples. We present a comprehensive study comparing supervised and unsupervised methods with our approach on two datasets composed of daily living activities. The results show that it is possible to reduce the required annotated data by more than 89% while still maintaining accurate model performance.
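The select-then-propagate loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid classifier, the least-confidence selection criterion, the fixed confidence threshold, and every name used here (`ssal_self_training`, `oracle`, `conf_threshold`) are assumptions chosen only to keep the example short and dependency-free.

```python
import math


def centroids(X, y):
    """Mean feature vector per class, computed from the labelled samples."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_proba(cents, x):
    """Class probabilities as a softmax over negative centroid distances."""
    labels = sorted(cents)
    scores = [-math.dist(x, cents[c]) for c in labels]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return {c: e / total for c, e in zip(labels, exps)}


def ssal_self_training(X, oracle, seed_idx, n_queries=10, conf_threshold=0.9):
    """Hypothetical SSAL loop: query the expert on the least confident
    sample, then pseudo-label every sample the model is confident about.

    `oracle(i)` stands in for the human expert and returns the true label
    of sample i. Returns (labels, number_of_expert_annotations).
    """
    labels = {i: oracle(i) for i in seed_idx}  # initial expert annotations
    n_expert = len(labels)
    for _ in range(n_queries):
        pool = [i for i in range(len(X)) if i not in labels]
        if not pool:
            break  # every sample already carries a label
        idx = sorted(labels)
        cents = centroids([X[i] for i in idx], [labels[i] for i in idx])
        probs = {i: predict_proba(cents, X[i]) for i in pool}
        # Active learning step (assumed criterion: least confidence).
        query = min(pool, key=lambda i: max(probs[i].values()))
        labels[query] = oracle(query)
        n_expert += 1
        # Self-training step: accept predictions above the threshold.
        for i in pool:
            if i == query:
                continue
            best = max(probs[i], key=probs[i].get)
            if probs[i][best] >= conf_threshold:
                labels[i] = best
    return labels, n_expert
```

In this sketch, pseudo-labels are treated exactly like expert labels once accepted, and the seed set must contain at least one sample per class. A practical system would likely weight or revisit pseudo-labels, since an early wrong label propagates into later centroids.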

Highlights

  • Over the last years, technological advances in ubiquitous sensing mechanisms have led to a proliferation of available data, which are often unlabelled

  • The Continuous Activities of Daily Living (CADL) dataset was used to provide validation in a scenario closer to real-world requirements

  • The volume of data recorded by such sensing equipment is significant and poses challenges for traditional machine learning approaches, which rely on annotated data to guarantee accurate model performance


Summary

Introduction

Technological advances in ubiquitous sensing mechanisms have led to a proliferation of available data, which are often unlabelled. Modern machine learning approaches require large amounts of labelled data to achieve adequate performance. This duality raises a relevant question: how can we optimise the process of data annotation and still learn an accurate machine learning model? The Human Activity Recognition (HAR) field has been a source of large quantities of available data, mostly owing to its myriad applications in real-life scenarios such as healthcare, indoor location, user-adaptive recommendations and transportation [1,2]. According to the World Health Organization [3], insufficient physical activity has been identified as the fourth leading risk factor for global mortality, being one of the main causes of several diseases and correlated with overweight and obesity. The practice of physical exercise is correlated with an increase in cardio-respiratory and muscular fitness, functional health, cognitive functions and improvement
