Abstract

Smart homes have become central to building sustainability, and recognizing human activity in smart homes is a key enabler of home automation. Recently, two-stream Convolutional Neural Networks (CNNs) have shown promising performance for video-based human action recognition. However, such models cannot act directly on 3D skeletal sequences because they are limited to 2D image and video inputs. Given the descriptive power of 3D skeletal data for characterizing human activity, in this study we present a novel method for recognizing skeletal human activity in sustainable smart homes using a CNN fusion model. Our proposed method represents the spatiotemporal information of each 3D skeletal sequence as three images and three image sequences through gray-value encoding, referred to as skeletal trajectory shape images (STSIs) and skeletal pose image (SPI) sequences, and builds a CNN fusion model that takes the three STSIs and three SPI sequences as input for skeletal activity recognition. The three STSIs and three SPI sequences are generated in three orthogonal planes, so that they complement each other. The proposed CNN fusion model allows hierarchical learning of spatiotemporal features, offering better action recognition performance. Experimental results on three public datasets show that our method outperforms state-of-the-art methods.
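The core encoding step can be illustrated with a minimal sketch. The function below is a hypothetical reconstruction, not the authors' implementation: it projects a 3D skeletal sequence onto the three orthogonal planes (xy, yz, xz) and gray-value encodes the joint trajectories so that later frames map to brighter pixels, yielding one trajectory image per plane. The array layout `(frames, joints, 3)` and the image size are assumptions for illustration.

```python
import numpy as np

def encode_trajectory_images(skeleton, size=64):
    """Sketch of gray-value trajectory encoding (assumed layout).

    skeleton: float array of shape (frames, joints, 3).
    Returns a list of three (size, size) uint8 images, one per
    orthogonal plane (xy, yz, xz), where pixel intensity encodes
    the temporal order of the joint positions.
    """
    planes = [(0, 1), (1, 2), (0, 2)]  # xy, yz, xz coordinate pairs
    n_frames, n_joints, _ = skeleton.shape
    images = []
    for a, b in planes:
        img = np.zeros((size, size), dtype=np.uint8)
        pts = skeleton[:, :, [a, b]].reshape(-1, 2)
        # Normalize projected coordinates to the image grid.
        lo, hi = pts.min(axis=0), pts.max(axis=0)
        span = np.where(hi - lo > 0, hi - lo, 1.0)
        grid = ((pts - lo) / span * (size - 1)).astype(int)
        # Gray value encodes time: later frames are brighter.
        frame_idx = np.repeat(np.arange(n_frames), n_joints)
        gray = (frame_idx / max(n_frames - 1, 1) * 255).astype(np.uint8)
        for (x, y), g in zip(grid, gray):
            img[y, x] = max(img[y, x], g)  # keep the most recent pass
        images.append(img)
    return images
```

The three resulting images could then be fed to three CNN branches whose features are fused, mirroring the complementary-plane design described in the abstract.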
