Abstract

Human group activity recognition (GAR) has attracted significant attention from computer vision researchers due to its wide practical applications in security surveillance, social role understanding and sports video analysis. In this paper, we give a comprehensive overview of the advances in group activity recognition in videos during the past 20 years. First, we provide a summary and comparison of 11 GAR video datasets in this field. Second, we survey the group activity recognition methods, including those based on handcrafted features and those based on deep learning networks. For better understanding of the pros and cons of these methods, we compare various models from the past to the present. Finally, we outline several challenging issues and possible directions for future research. From this comprehensive literature review, readers can obtain an overview of progress in group activity recognition for future studies.

Highlights

  • In recent years, the widespread use of surveillance equipment has rapidly increased the amount of video data

  • We focus on group activity, which is composed of one or more sub-groups of visually countable persons interacting in the scene

  • This paper provides a comprehensive survey of current group activity recognition methods

Summary

Introduction

The widespread use of surveillance equipment has rapidly increased the amount of video data. Analyzing group activity captures detailed information about individuals as well as their interactions, which gives a more informative and practically meaningful description of a scene. To the best of our knowledge, the most recent survey related to group activity recognition was published in 2017 [7]. It focuses mainly on methods based on handcrafted features, while deep learning based methods are not discussed in depth. An overview of group activity recognition methods that includes the recent state of the art is therefore needed. Our survey covers a broad range of recent work and discusses recent research trends in group activity recognition.

Datasets
Surveillance datasets
Sports datasets
Approaches based on handcrafted features
Top-down approach
Bottom-up approach
Interaction context
Deep learning based methods
Hierarchical temporal modeling
Deep relationship modeling
Attention modeling
Unified modeling framework
Challenges and trends
Traditional methods
Conclusions