Abstract

Unbalanced interaction relationships at the personal and group levels play a pivotal role in collective activity recognition, yet previous approaches have not explored them adaptively and jointly. In this paper, we propose a graph attention interaction model (GAIM), embedded with a graph attention block (GAB), that explicitly and adaptively infers unbalanced interaction relations at the personal and group levels in a unified architecture, and then learns the spatial and temporal evolution of the collective activity from these interactions to predict activity labels. We first design spatiotemporal graphs tailored to collective activity, in which concurrent person and group nodes represent individuals’ actions and the collective activity, respectively. The graphs provide both spatial structure and semantic appearance features for the collective activity. The GAB then applies convolution-like filters on the graphs to infer unequal, two-level interaction relations by combining graph convolutional networks with a shared attention mechanism. At the personal level, the GAB learns different degrees of interaction for each person node from its neighboring person nodes under the guidance of the group node. At the group level, the GAB assesses the varying degrees of interaction that person nodes contribute to the group node. Equipped with a GRU network, GAIM learns the spatial and temporal evolution of individuals’ actions as well as the collective activity from the captured interactions and finally predicts the label of the collective activity. Experiments on four publicly available datasets, together with ablation studies, evaluate the performance of GAIM, and the improved results demonstrate the effectiveness of our model.
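
To make the two-level attention concrete, the sketch below shows one possible form of a GAB-style update in PyTorch. It is an illustrative assumption rather than the authors’ implementation: the class name GraphAttentionBlockSketch, the feature dimensions, the single shared attention head, and the fully connected adjacency are all hypothetical, and the personal-level update omits the group-node guidance described above for brevity.

```python
# Minimal sketch (assumption, not the paper's exact GAB): a GAT-style layer that
# updates person-node features from their neighbours, then pools person nodes
# into the group node with learned, unequal attention weights.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphAttentionBlockSketch(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)    # shared feature projection
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)  # shared attention mechanism

    def forward(self, person_feats, group_feat, adj):
        # person_feats: (N, in_dim) appearance features of N person nodes
        # group_feat:   (1, in_dim) feature of the concurrent group node
        # adj:          (N, N) adjacency among person nodes; assumed to contain
        #               self-loops so every row has at least one neighbour
        h = self.W(person_feats)                            # (N, out_dim)
        g = self.W(group_feat)                              # (1, out_dim)
        N = h.size(0)

        # Personal level: unequal attention between person i and neighbour j
        pair = torch.cat([h.unsqueeze(1).expand(N, N, -1),
                          h.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pair)).squeeze(-1)       # (N, N) raw scores
        e = e.masked_fill(adj == 0, float('-inf'))
        alpha = torch.softmax(e, dim=1)                     # person-person weights
        person_out = alpha @ h                              # (N, out_dim)

        # Group level: each person contributes a different degree to the group node
        ge = F.leaky_relu(self.attn(torch.cat([g.expand(N, -1), h], dim=-1))).squeeze(-1)
        beta = torch.softmax(ge, dim=0)                     # (N,) person-to-group weights
        group_out = (beta.unsqueeze(-1) * h).sum(dim=0, keepdim=True)
        return person_out, group_out


if __name__ == "__main__":
    gab = GraphAttentionBlockSketch(in_dim=256, out_dim=128)
    persons = torch.randn(6, 256)               # 6 detected people in one frame
    group = persons.mean(dim=0, keepdim=True)   # naive group-node initialisation (assumption)
    adj = torch.ones(6, 6)                      # fully connected graph
    p_out, g_out = gab(persons, group, adj)
```

In the full model, the updated person and group features from each frame would then feed a GRU over time steps so that the spatial and temporal evolution of the activity can be learned before classification.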
