Abstract

Group activity recognition from videos is a challenging problem that remains largely unaddressed. We propose an activity recognition method using group context. In order to encode both single-person descriptions and two-person interactions, we learn mappings from high-dimensional feature spaces to low-dimensional dictionaries. In particular, the proposed two-person descriptor takes into account geometric characteristics of the relative pose and motion between the two persons. Both single-person and two-person representations are then used to define unary and pairwise potentials of an energy function, whose optimization leads to a structured labeling of persons involved in the same activity. An interesting feature of the proposed method is that, unlike the vast majority of existing methods, it is able to recognize multiple distinct group activities occurring simultaneously in a video. The proposed method is evaluated on datasets widely used for group activity recognition and is compared with several baseline methods.
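The structured labeling described above can be illustrated with a minimal toy sketch: an energy over per-person activity labels combining unary terms (from single-person descriptors) and pairwise terms (from two-person interaction descriptors). All scores, the interaction graph, and the brute-force minimizer below are illustrative assumptions, not the paper's actual potentials or inference procedure.

```python
from itertools import product

# Toy unary costs: unary[i][a] = cost of assigning activity label a
# to person i (in the paper these would come from the learned
# single-person dictionary representation).
unary = [
    [0.2, 1.0],   # person 0 prefers activity 0
    [0.3, 0.9],   # person 1 prefers activity 0
    [1.1, 0.1],   # person 2 prefers activity 1
]

def pairwise(i, j, a, b):
    """Toy pairwise cost: penalize interacting persons that receive
    different labels (stand-in for the two-person descriptor term)."""
    interacting = {(0, 1)}  # hypothetical interaction graph
    if (i, j) in interacting and a != b:
        return 1.5
    return 0.0

def energy(labels):
    """Sum of unary and pairwise potentials for a full labeling."""
    e = sum(unary[i][a] for i, a in enumerate(labels))
    n = len(labels)
    e += sum(pairwise(i, j, labels[i], labels[j])
             for i in range(n) for j in range(i + 1, n))
    return e

# Exhaustive minimization over all labelings; tractable only for tiny
# examples (real structured inference would be used in practice).
best = min(product(range(2), repeat=3), key=energy)
```

Note that the minimizing labeling can assign different activity labels to different persons in the same frame, which mirrors the paper's ability to recognize multiple distinct group activities occurring simultaneously.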
