Abstract

Group activity recognition is a challenging task because there is an exponentially large number of semantic and geometric relationships among individuals, which makes it difficult to model these interactions and merge them as a whole for group activity classification. In this paper, we propose a deep fully-connected model for group activity recognition. First, we use a spatial-temporal model based on a convolutional neural network (CNN) and a long short-term memory (LSTM) network to capture the dynamic features of each person. Then, we use a fully-connected conditional random field (FCCRF) to learn the interactions between people. Finally, with the FCCRF potential functions we refine the activity recognition predicted by the spatial-temporal model. Experimental results on the Collective Activity dataset and the Collective Activity Extended dataset show that our model improves recognition accuracy over baseline methods and achieves competitive results in comparison to state-of-the-art models.
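
As a rough illustration of the spatial-temporal branch described above, the following PyTorch-style sketch extracts CNN features from each person's cropped image sequence and feeds them through an LSTM to obtain per-person dynamic features and preliminary behavior scores. The class and parameter names (PersonSpatialTemporalModel, feat_dim, num_actions) are illustrative assumptions, not the authors' actual implementation.

```python
# A minimal sketch of the per-person spatial-temporal model (CNN + LSTM).
# All names and hyper-parameters are assumptions for illustration only.
import torch
import torch.nn as nn
import torchvision.models as models

class PersonSpatialTemporalModel(nn.Module):
    def __init__(self, feat_dim=512, hidden_dim=256, num_actions=6):
        super().__init__()
        # CNN backbone extracts appearance features from each person crop
        backbone = models.resnet18(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # drop the classifier
        # LSTM aggregates the per-frame features into dynamic features
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # Linear head gives a preliminary behavior prediction per person
        self.classifier = nn.Linear(hidden_dim, num_actions)

    def forward(self, person_clips):
        # person_clips: (batch, time, 3, H, W) cropped frames of one person
        b, t = person_clips.shape[:2]
        frames = person_clips.flatten(0, 1)                 # (b*t, 3, H, W)
        feats = self.cnn(frames).flatten(1)                 # (b*t, feat_dim)
        feats = feats.view(b, t, -1)                        # (b, t, feat_dim)
        dynamics, _ = self.lstm(feats)                      # (b, t, hidden_dim)
        logits = self.classifier(dynamics[:, -1])           # preliminary action scores
        return dynamics[:, -1], logits
```

In this sketch the last LSTM output serves as the per-person dynamic feature and its classifier score as the preliminary behavior prediction; the actual backbone, sequence length, and pooling choices of the paper may differ.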

Highlights

  • In recent years, vision-based human activity recognition has become an active research direction

  • The main contributions of this paper are as follows: (1) we propose a graphical framework based on a deep learning network to model the interactions between people in a group activity; (2) we use a fully-connected conditional random field model to correct the prediction errors produced by the deep learning network

  • After the video images are processed by the spatial-temporal model based on a convolutional neural network (CNN) and a long short-term memory (LSTM) network, the resulting output contains the preliminary observation information and behavior category of each person in the image (a usage sketch follows this list)
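
For concreteness, here is a hedged usage sketch of how the per-person outputs described above might be collected for one clip, reusing the hypothetical PersonSpatialTemporalModel defined under the abstract; the output structure (a feature vector plus a preliminary action label and scores per person) is an assumption based on the description, not the authors' exact interface.

```python
# Hypothetical usage: collect preliminary observations and behavior categories
# for every person tracked in a clip (names and shapes are assumptions).
import torch

model = PersonSpatialTemporalModel()
model.eval()

# person_tracks: list of (time, 3, H, W) tensors, one cropped clip per person
person_tracks = [torch.randn(10, 3, 224, 224) for _ in range(5)]

observations = []
with torch.no_grad():
    for track in person_tracks:
        feats, logits = model(track.unsqueeze(0))      # add batch dimension
        observations.append({
            "features": feats.squeeze(0),              # preliminary observation info
            "action": int(logits.argmax(dim=-1)),      # preliminary behavior category
            "scores": logits.squeeze(0),               # unary scores for the CRF stage
        })
```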


Summary

INTRODUCTION

Vision-based human activity recognition has become an active research direction. Our approach uses a CNN and an LSTM network to obtain the dynamic information of single-person behavior as well as a preliminary prediction of the group activity; a conditional random field (CRF) graphical model to describe the interactions between people in the group; and the combined knowledge from both the CRF and the deep learning network to refine the individual and group activity recognition. The main contributions of this paper are as follows: (1) we propose a graphical framework based on a deep learning network to model the interactions between people in a group activity; (2) we use a fully-connected conditional random field model to correct the prediction errors produced by the deep learning network. The remainder of the paper is organized as follows: Section 2 reviews related work on group activity recognition; Section 3 describes the conditional random field model based on the deep learning network; Section 4 analyzes behavior classification; Section 5 briefly introduces model training; Section 6 presents the experimental results and compares them with other models.
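
To make the refinement step concrete, the sketch below shows one common way a fully-connected CRF can be combined with the network's per-person scores: the deep model's predictions act as unary potentials, a pairwise term couples every pair of people, and a few mean-field iterations produce refined label distributions. The pairwise weighting by spatial distance and the function names are assumptions for illustration, not the paper's exact potential functions.

```python
# A minimal sketch of fully-connected CRF refinement via mean-field updates.
# Pairwise potentials here use a simple label-compatibility matrix weighted by
# spatial proximity between people; the paper's actual potentials may differ.
import torch

def fccrf_refine(unary_logits, positions, compat, num_iters=5, sigma=50.0):
    """
    unary_logits: (N, C) preliminary per-person action scores from the CNN-LSTM
    positions:    (N, 2) person center coordinates in the frame
    compat:       (C, C) label compatibility matrix (learned or hand-set)
    Returns refined per-person label distributions of shape (N, C).
    """
    # Gaussian kernel over pairwise distances (fully connected: every pair)
    dists = torch.cdist(positions, positions)               # (N, N)
    kernel = torch.exp(-dists ** 2 / (2 * sigma ** 2))
    kernel.fill_diagonal_(0.0)                               # no self-message

    q = torch.softmax(unary_logits, dim=-1)                  # initialize with unaries
    for _ in range(num_iters):
        message = kernel @ q                                  # aggregate neighbors' beliefs
        pairwise = message @ compat                           # apply label compatibility
        q = torch.softmax(unary_logits - pairwise, dim=-1)    # combine with unary term
    return q
```

Under this reading, a person whose preliminary label disagrees with the labels of most nearby people is pulled toward the local consensus, which is one way the CRF can correct prediction errors made by the deep network.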

RELATED WORK
FULLY CONNECTED CONDITIONAL RANDOM FIELD MODEL
STRENGTH VISUALIZATION OF THE FULLY CONNECTED RELATIONSHIP
BEHAVIOR PREDICTIONS
ALGORITHM TRAINING
EXPERIMENTAL CLASSIFICATION RESULTS AND ANALYSIS
CONCLUSION