Spatio-temporal attention mechanisms based model for collective activity recognition

Lihua Lu, Huijun Di, Yao Lu*, Lin Zhang, Shunzhou Wang

*Corresponding author for this work

Research output: Contribution to journal (article, peer-reviewed)

17 Citations (Scopus)

Abstract

Collective activity recognition, in which multiple people act and interact in a shared scene, is a widely studied but challenging problem in computer vision. The key to this task is how to efficiently model the spatial and temporal evolution of collective activities. In this paper, we propose a model based on spatio-temporal attention mechanisms to exploit spatial configurations and temporal dynamics in collective scenes. We present spatio-temporal attention mechanisms built from both deep RGB features and articulated human poses to capture the spatio-temporal evolution of individuals’ actions and of the collective activity. Benefiting from these attention mechanisms, our model learns to spatially capture unbalanced person–group interactions for each person while updating each individual’s state based on these interactions, and to temporally assess the reliability of different video frames when predicting the final label of the collective activity. Furthermore, long-range temporal variability and consistency are handled by a two-stage Gated Recurrent Unit (GRU) network. Finally, to ensure effective training, we jointly optimize the losses at both the person and group levels. Experimental results indicate that our method outperforms the state of the art on the Volleyball dataset. Additional validation experiments and visual results demonstrate the effectiveness and practicality of the proposed model.
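As a rough illustration of two ideas named in the abstract, the sketch below shows (a) attention-weighted pooling over video frames, where each frame receives a reliability weight, and (b) a loss that sums a group-level term and a person-level term. This is not the authors' implementation: the function names, the use of a softmax over scalar frame scores, the cross-entropy form of the losses, and the `person_weight` parameter are all assumptions made for illustration, and the abstract does not specify any of them.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_attention_pool(frame_features, frame_scores):
    """Pool per-frame feature vectors into one vector.

    Assumption: each frame has a scalar score; softmax turns the scores
    into weights, and the pooled vector is the weighted sum of the frames.
    """
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    pooled = [0.0] * dim
    for w, feat in zip(weights, frame_features):
        for i in range(dim):
            pooled[i] += w * feat[i]
    return pooled, weights

def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])

def joint_loss(group_probs, group_label, person_probs, person_labels,
               person_weight=1.0):
    """Group-level loss plus the mean person-level loss.

    Assumption: both terms are cross-entropies and `person_weight`
    (hypothetical) balances them.
    """
    group_term = cross_entropy(group_probs, group_label)
    person_term = sum(
        cross_entropy(p, y) for p, y in zip(person_probs, person_labels)
    ) / len(person_labels)
    return group_term + person_weight * person_term

# Usage: two frames with equal scores are averaged.
pooled, weights = temporal_attention_pool([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
```

With equal scores the pooling reduces to a plain average over frames; a frame with a much higher score dominates the pooled vector, which is the sense in which such weights can act as per-frame reliabilities.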

Original language: English
Pages (from-to): 162-174
Number of pages: 13
Journal: Signal Processing: Image Communication
Volume: 74
Publication status: Published - May 2019

Keywords

  • Attention mechanisms
  • Gated Recurrent Units (GRUs) network
  • Multi-modal data
  • Multi-person activity recognition
  • Spatio-temporal model
