Abstract

Human activity recognition (HAR) tasks have traditionally been solved using engineered features obtained through heuristic processes. Current research suggests that deep convolutional neural networks are suited to automate feature extraction from raw sensor inputs. However, human activities are made of complex sequences of motor movements, and capturing these temporal dynamics is fundamental for successful HAR. Based on the recent success of recurrent neural networks for time series domains, we propose a generic deep framework for activity recognition based on convolutional and LSTM recurrent units, which: (i) is suitable for multimodal wearable sensors; (ii) can perform sensor fusion naturally; (iii) does not require expert knowledge in designing features; and (iv) explicitly models the temporal dynamics of feature activations. We evaluate our framework on two datasets, one of which has been used in a public activity recognition challenge. Our results show that our framework outperforms competing deep non-recurrent networks on the challenge dataset by 4% on average, and outperforms some previously reported results by up to 9%. The framework can be applied to homogeneous sensor modalities, but can also fuse multimodal sensors to improve performance. We characterise the influence of key architectural hyperparameters on performance to provide insights about their optimisation.
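As a rough illustration of such an architecture, the minimal PyTorch sketch below stacks 1-D convolutional feature extractors over raw sensor channels and feeds the resulting feature sequence into stacked LSTM layers. All concrete values here (113 input channels and 18 classes, as in an OPPORTUNITY-like setup, 64 filters of length 5, two 128-unit LSTM layers) are illustrative assumptions, not the configuration reported in the paper.

```python
import torch
import torch.nn as nn

class DeepConvLSTM(nn.Module):
    """Sketch of a convolutional + LSTM network for HAR: 1-D convolutions
    extract features from raw multimodal sensor channels, and stacked
    LSTM layers model the temporal dynamics of the feature activations."""

    def __init__(self, n_channels=113, n_classes=18,
                 n_filters=64, kernel_size=5, lstm_units=128):
        super().__init__()
        # Convolutions run along the time axis over all sensor channels
        # jointly, so multimodal sensor fusion happens in these layers.
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, n_filters, kernel_size), nn.ReLU(),
            nn.Conv1d(n_filters, n_filters, kernel_size), nn.ReLU(),
            nn.Conv1d(n_filters, n_filters, kernel_size), nn.ReLU(),
            nn.Conv1d(n_filters, n_filters, kernel_size), nn.ReLU(),
        )
        # Recurrent layers capture temporal dependencies across the
        # sequence of convolutional feature activations.
        self.lstm = nn.LSTM(n_filters, lstm_units, num_layers=2,
                            batch_first=True)
        self.classifier = nn.Linear(lstm_units, n_classes)

    def forward(self, x):
        # x: (batch, time, channels); Conv1d expects (batch, channels, time).
        feats = self.conv(x.transpose(1, 2))
        # Back to (batch, time, features) for the LSTM.
        out, _ = self.lstm(feats.transpose(1, 2))
        # Classify the activity from the last time step of the window.
        return self.classifier(out[:, -1, :])

# Example: a batch of 8 sliding windows, 24 samples long, 113 channels.
model = DeepConvLSTM()
print(model(torch.randn(8, 24, 113)).shape)  # torch.Size([8, 18])
```

Applying the convolutions across all channels at once is what lets a network of this kind fuse sensor modalities without hand-crafted features; reshaping the raw sliding window is the only preprocessing required.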

Highlights

  • Recognizing human activities and the context in which they occur from sensor data is at the core of smart assistive technologies, such as in smart homes [1], in rehabilitation [2], in health support [3,4], in skill assessment [5] or in industrial settings [6]

  • We evaluate DeepConvLSTM on two human activity recognition datasets and compare its performance against baseline convolutional neural networks (CNNs), which provide a performance reference for deep non-recurrent networks, and against results reported in the literature on these datasets using other machine learning techniques

  • It can be noticed how DeepConvLSTM offers a striking performance improvement: there is more than a 9% improvement in the gesture recognition task without the Null class when compared to the results reported in the OPPORTUNITY challenge (the sketch after this list illustrates this style of evaluation)
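
The OPPORTUNITY challenge scored submissions with a class-weighted F1 score, and HAR results are commonly reported both with and without the Null (no-activity) class. The excerpt does not spell out the metric, so the following is a hedged sketch under that assumption; treating label 0 as the Null class is likewise an assumption.

```python
import numpy as np

def weighted_f1(y_true, y_pred, ignore_label=0):
    """Class-weighted F1, excluding a Null label (assumed to be 0 here).
    Each class's F1 is weighted by its share of the non-Null ground
    truth, as in common HAR benchmark protocols."""
    labels = [c for c in np.unique(y_true) if c != ignore_label]
    total = sum((y_true == c).sum() for c in labels)
    score = 0.0
    for c in labels:
        tp = ((y_pred == c) & (y_true == c)).sum()
        fp = ((y_pred == c) & (y_true != c)).sum()
        fn = ((y_pred != c) & (y_true == c)).sum()
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        score += ((y_true == c).sum() / total) * f1
    return score

# Toy example: label 0 is Null, labels 1 and 2 are gestures.
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])
print(round(weighted_f1(y_true, y_pred), 3))  # 0.8
```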

Introduction

Recognizing human activities (e.g., from simple hand gestures to complex activities, such as “cooking a meal”) and the context in which they occur from sensor data is at the core of smart assistive technologies, such as in smart homes [1], in rehabilitation [2], in health support [3,4], in skill assessment [5] or in industrial settings [6]. Many scenarios of high societal value are still elusive, such as providing a “memory prosthesis” to people with dementia, inserting subtle cues into everyday life in the right context to support voluntary behaviour change (e.g., to fight obesity), or enabling natural human-robot interaction in everyday settings. These scenarios require a minute understanding of a person’s activities both at home and out and about. The OPPORTUNITY challenge, run in 2011 with the aim of recognising activities in a home environment, showed that contenders did not reach an accuracy…
