Abstract
In socially assistive robotics, human activity recognition plays a central role whenever the robot must adapt its behavior to the human's. In this paper, we present an activity recognition approach for activities of daily living based on deep learning and skeleton data. In the literature, ad hoc feature extraction/selection algorithms combined with supervised classification methods have been deployed, achieving excellent classification performance. Here, we propose a deep learning approach that combines a CNN and an LSTM: it learns the spatial dependencies correlating the limbs in a 3D grid representation of the skeleton, learns the temporal dependencies among instances with a periodic pattern, and works on raw data, thus requiring no explicit feature extraction process. The models are designed for real-time activity recognition and are tested on the CAD-60 dataset. Results show that the proposed model outperforms an LSTM-only model thanks to the automatic extraction of features capturing limb correlations. In the “New Person” setting, the CNN-LSTM model achieves 95.4% precision and 94.4% recall, while in the “Have Seen” setting it reaches 96.1% precision and 94.7% recall.
Highlights
Personal service robotics applications are already on the market for use in human-populated environments such as workplaces, public spaces, and homes
We investigate training the recognition module on both the spatial dependencies arising from the relationships among the RGB-D skeleton joints, via a convolutional neural network (CNN), and the temporal patterns of the activities, via a long short-term memory network (LSTM)
Unlike the approaches applied to the CAD-60 dataset that select and extract features manually, we propose a deep learning model for automatic feature extraction, using CNNs to extract spatial dependencies from human poses and LSTMs to extract temporal dependencies between poses
Summary
Personal service robotics applications are already on the market for use in human-populated environments such as workplaces, public spaces, and homes. Taking inspiration from the work of [6,20], where the authors propose spatiotemporal classification for video description from images and for activity recognition from wearable-device data, respectively, we aim at achieving the same results by combining CNNs with an LSTM, gaining the benefits of both spatial and temporal learning. Following this idea, we investigate training the recognition module on both the spatial dependencies arising from the relationships among the RGB-D skeleton joints, via a CNN, and the temporal patterns of the activities, via an LSTM. When the performance is compared over the whole duration of a video, the approach performs on par with other state-of-the-art approaches.
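The combined spatial/temporal design described above can be sketched as a small PyTorch module. This is a minimal illustration, not the paper's exact architecture: the grid size, channel counts, hidden size, and class count are assumptions chosen for clarity. Per-frame 2D convolutions extract spatial dependencies from a skeleton grid whose three channels hold the joints' x, y, z coordinates, and an LSTM then models the temporal dependencies across frames.

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """Hedged sketch of a CNN-LSTM activity classifier (hyperparameters assumed)."""

    def __init__(self, n_classes=12, hidden=128):
        super().__init__()
        # CNN applied independently to every frame's skeleton grid
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # 3 channels = x, y, z
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),                # fixed-size spatial summary
        )
        # LSTM consumes the per-frame CNN feature vectors in temporal order
        self.lstm = nn.LSTM(32 * 4 * 4, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, time, 3, H, W) -- a clip of skeleton-grid frames
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1))        # (b*t, 32, 4, 4)
        feats = feats.flatten(1).view(b, t, -1)  # (b, t, 512)
        out, _ = self.lstm(feats)                # temporal modelling
        return self.fc(out[:, -1])               # classify from the last time step

model = CNNLSTM()
logits = model(torch.randn(2, 30, 3, 8, 8))  # 2 clips of 30 frames on an 8x8 grid
print(logits.shape)                          # torch.Size([2, 12])
```

Because the convolution weights are shared across frames, the spatial feature extractor stays small and the per-frame cost is constant, which is what makes this kind of pipeline usable for real-time recognition.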