Abstract

This paper proposes a novel multimodal data classifier, the Center Boundary Balancing Multimodal Classifier (CBBMC), to fuse and classify spatial and temporal descriptors for recognizing human actions from depth video sequences. CBBMC is a composite algorithm that integrates feature fusion and feature classification, in which Center Boundary Balancing Projection (CBBP) balances the center and boundary information of the feature class spaces. To address multimodal information redundancy and isolation, two feature selection and fusion schemes for CBBMC based on embedded feature selection are presented. Moreover, two new action descriptors, Gaussian Pyramid Depth Motion Images (GP-DMI) and Depth Temporal Maps (DTM), are introduced to capture the multi-scale spatial and fine-grained temporal information of human activities. Finally, we present an effective spatial and temporal information fusion framework based on CBBMC for human action recognition. To evaluate the proposed approach, extensive experiments are conducted. Using only the depth modality, the proposed method achieves strong accuracy on four benchmark datasets: MSR Action3D (96.33%), UTD-MHAD (94.41%), DHA (95.65%), and NTU RGB+D (83.31% cross-subject and 87.66% cross-view). The experimental results demonstrate the effectiveness of our method.
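To make the GP-DMI descriptor concrete, the sketch below shows one common way such a descriptor can be built: a depth motion image (DMI) is accumulated from absolute differences of consecutive depth frames, and a Gaussian pyramid of the DMI then yields multi-scale spatial representations. This is a minimal illustrative sketch, not the paper's exact formulation; the function names `depth_motion_image` and `gaussian_pyramid`, the 5-tap binomial smoothing kernel, and the pyramid depth are assumptions introduced here for illustration.

```python
import numpy as np

def depth_motion_image(frames):
    """Accumulate absolute differences between consecutive depth frames
    into a single motion-energy map (one common DMI construction)."""
    frames = np.asarray(frames, dtype=np.float64)  # shape: (T, H, W)
    return np.abs(np.diff(frames, axis=0)).sum(axis=0)

def gaussian_pyramid(img, levels=3):
    """Build a multi-scale pyramid by separable 5-tap binomial smoothing
    followed by 2x downsampling at each level."""
    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # approximates a Gaussian
    smooth = lambda v: np.convolve(np.pad(v, 2, mode="reflect"), k, mode="valid")
    pyramid = [img]
    for _ in range(levels - 1):
        cur = pyramid[-1]
        blurred = np.apply_along_axis(smooth, 1, cur)   # blur rows
        blurred = np.apply_along_axis(smooth, 0, blurred)  # blur columns
        pyramid.append(blurred[::2, ::2])  # downsample by 2 in each axis
    return pyramid
```

Features extracted from each pyramid level would then be fused and classified by CBBMC; the fusion and classification stages are not shown here.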
