TY - JOUR
T1 - GAIM: Graph Attention Interaction Model for Collective Activity Recognition
AU - Lu, Lihua
AU - Lu, Yao
AU - Yu, Ruizhe
AU - Di, Huijun
AU - Zhang, Lin
AU - Wang, Shunzhou
N1 - Publisher Copyright: © 1999-2012 IEEE.
PY - 2020/2
Y1 - 2020/2
AB - Unbalanced interaction relationships at the personal and group levels play a pivotal role in collective activity recognition, yet previous approaches have not explored them adaptively and jointly. In this paper, we propose a graph attention interaction model (GAIM) with an embedded graph attention block (GAB) that explicitly and adaptively infers unbalanced interaction relations at the personal and group levels in a unified architecture, and then learns the spatial and temporal evolution of the collective activity from these interactions to predict activity labels. We first design spatiotemporal graphs tailored to the collective activity, in which concurrent person nodes and group nodes represent individuals' actions and the collective activity, respectively. The graphs provide both spatial structure and semantic appearance features for the collective activity. The GAB then applies convolution-like filters to the graphs to infer unequal, two-level interaction relations in the collective activity by implementing graph convolutional networks with a shared attention mechanism. At the personal level, the GAB learns different levels of interaction for each person node from its neighboring person nodes under the guidance of the group node. At the group level, the GAB assesses the varying degrees to which person nodes contribute interactions to the group node. Equipped with a GRU network, the GAIM learns the spatial and temporal evolution of individuals' actions and of the collective activity from the captured interactions, and finally predicts the label of the collective activity. We evaluate the GAIM through experiments on four publicly available datasets and through ablation studies, and the improved performance demonstrates the effectiveness of the model.
KW - Attention Mechanisms
KW - Collective Activity Recognition
KW - Graph Convolutional Networks
KW - Unbalanced and Two-Level Interactions
UR - http://www.scopus.com/inward/record.url?scp=85079595207&partnerID=8YFLogxK
U2 - 10.1109/TMM.2019.2930344
DO - 10.1109/TMM.2019.2930344
M3 - Article
AN - SCOPUS:85079595207
SN - 1520-9210
VL - 22
SP - 524
EP - 539
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
IS - 2
M1 - 8769904
ER -