Abstract

Human activity recognition (HAR) using body-worn sensors is an active research area in human-computer interaction and human activity analysis. Traditional methods rely on hand-crafted features to classify activities, an approach that depends heavily on human domain knowledge and yields only shallow feature representations. Rapid progress in deep learning has led most researchers to adopt deep models that extract features from raw data automatically. Most existing HAR networks operate on multimodal sensor data but focus mainly on the top representation produced by the bottom-up feedforward pass, without reusing features from the lower layers. In this paper, we present a novel hybrid deep learning network for human activity recognition that also employs multimodal sensor data; our model is a ConvLSTM pipeline that makes full use of the information extracted in each layer along the temporal domain. To this end, we propose a dense connection module (DCM) that ensures maximum information flow between the network layers. We further employ a multilayer feature aggregation module (MFAM) to extract features along the spatial domain, aggregating the features obtained from every convolutional layer according to the importance of features at different spatial locations. The output of the MFAM is fed into two LSTM layers to further model temporal dependencies. Finally, a fully connected layer and a softmax function compute the probability of each class. We demonstrate the effectiveness of the proposed model on two benchmark datasets, Opportunity and UniMiB-SHAR, where it outperforms state-of-the-art models. We also conduct experiments on efficiency, multimodal fusion and different hyperparameters to analyze the proposed network, and we carry out ablation and visualization studies to reveal the effectiveness of the two proposed modules.
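The pipeline described above (densely connected convolutions, weighted multilayer aggregation, recurrent temporal modelling, then a softmax classifier) can be sketched in a few lines of NumPy. This is an illustrative toy only: the shapes, the sum-based importance score, and the single small LSTM layer are our assumptions, not the authors' configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions; the paper's real configuration differs.
T, C, F, n_classes = 24, 8, 16, 4   # time steps, sensor channels, filters, classes

def relu(z):
    return np.maximum(z, 0.0)

def conv1d_same(x, w):
    """'Same'-padded 1D convolution along time; w has shape (k, c_in, c_out)."""
    k = w.shape[0]
    xp = np.pad(x, ((k // 2, k // 2), (0, 0)))
    out = np.empty((x.shape[0], w.shape[2]))
    for t in range(out.shape[0]):
        out[t] = np.tensordot(xp[t:t + k], w, axes=([0, 1], [0, 1]))
    return relu(out)

x = rng.standard_normal((T, C))

# Dense connection module (DCM): each conv layer receives the concatenation
# of the raw input and all earlier feature maps, so low-level features are
# reused rather than discarded by the feedforward pass.
feats = [x]
for _ in range(3):
    inp = np.concatenate(feats, axis=1)
    w = rng.standard_normal((3, inp.shape[1], F)) * 0.1
    feats.append(conv1d_same(inp, w))

# Multilayer feature aggregation module (MFAM): stack the conv outputs and
# combine them with per-location softmax weights (a toy stand-in for the
# paper's learned importance weighting).
stack = np.stack(feats[1:])                     # (layers, T, F)
scores = stack.sum(axis=2, keepdims=True)       # toy importance per location
alpha = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
agg = (alpha * stack).sum(axis=0)               # (T, F)

# One minimal LSTM layer stands in for the paper's two LSTM layers.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_last_hidden(seq, wx, wh, b):
    H = wh.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    for x_t in seq:
        z = x_t @ wx + h @ wh + b
        i, f = sigmoid(z[:H]), sigmoid(z[H:2 * H])
        g, o = np.tanh(z[2 * H:3 * H]), sigmoid(z[3 * H:])
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

H = 12
h = lstm_last_hidden(agg,
                     rng.standard_normal((F, 4 * H)) * 0.1,
                     rng.standard_normal((H, 4 * H)) * 0.1,
                     np.zeros(4 * H))

# Fully connected layer + softmax over activity classes.
logits = h @ (rng.standard_normal((H, n_classes)) * 0.1)
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

The key structural point is visible in the loop: because every layer's input is the concatenation of all preceding feature maps, information from the bottom layers flows directly to the top, which is what the DCM provides over a plain feedforward stack.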

Highlights

  • The growing popularity of smart, wearable devices has greatly expanded the availability of time-series sensor data related to human activities

  • By combining the dense connection module and the multilayer feature aggregation module, we propose a novel hybrid network for human activity recognition based on an underlying ConvLSTM network

  • To improve the performance of the human activity recognition (HAR) system and design a smaller network for use in mobile devices, we propose a novel hybrid model that fully aggregates features along both the temporal and spatial domains; it requires fewer parameters compared with DeepConvLSTM [6]


Summary

INTRODUCTION

The growing popularity of smart, wearable devices has greatly expanded the availability of time-series sensor data related to human activities (T. Lv et al.: Hybrid Network Based on Dense Connection and Weighted Feature Aggregation for HAR). For multimodal time-series data, a 1D convolution operation captures only local dependencies over time and does not fully exploit the dependencies between the channels of different sensors. Networks designed for human activity recognition focus mainly on the top representation extracted by the bottom-up feedforward process and ignore features from the bottom layers. By combining the dense connection module and the multilayer feature aggregation module, we propose a novel hybrid network for human activity recognition based on an underlying ConvLSTM network.
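The limitation of per-channel 1D convolution can be demonstrated in a few lines: when each sensor channel is filtered in isolation (a depthwise convolution, used here as an illustrative sketch with made-up dimensions), perturbing one channel leaves every other channel's output unchanged, so cross-channel dependencies are never captured.

```python
import numpy as np

rng = np.random.default_rng(1)
T, C, k = 10, 3, 3                       # time steps, channels, kernel width
x = rng.standard_normal((T, C))
kernels = rng.standard_normal((k, C))

def depthwise_conv(x, kernels):
    """Slide a kernel along time independently for each sensor channel."""
    out = np.empty((x.shape[0] - k + 1, x.shape[1]))
    for c in range(x.shape[1]):          # each channel filtered in isolation
        out[:, c] = np.convolve(x[:, c], kernels[:, c], mode="valid")
    return out

out = depthwise_conv(x, kernels)

# Shifting channel 0 leaves every other channel's output untouched:
x2 = x.copy()
x2[:, 0] += 5.0
out2 = depthwise_conv(x2, kernels)

assert np.allclose(out[:, 1:], out2[:, 1:])   # cross-channel info never mixes
```

A convolution whose kernel spans all input channels (or a recurrent layer over the concatenated channels, as in ConvLSTM pipelines) would instead let every output depend on every sensor.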

RELATED WORK
DENSE CONNECTION MODULE
MULTILAYER FEATURE AGGREGATION MODULE
EVALUATION
PERFORMANCE METRICS
MODEL TRAINING
RESULTS
CONCLUSION