What and Where to See: Deep Attention Aggregation Network for Action Detection

Yuxuan He, Ming Gang Gan*, Xiaozhou Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

With the development of deep convolutional neural networks, 2D CNNs have become widely used for the action detection task. Although a 2D CNN extracts rich features from video frames, these features also contain redundant information. To address this problem, we propose a Residual Channel-Spatial Attention (RCSA) module that guides the network toward what (object patterns) and where (spatial locations) to focus. Meanwhile, to make effective use of the rich spatial and semantic features extracted by different layers of a deep network, we combine RCSA with a deep aggregation network to form the Deep Attention Aggregation Network. Experimental results on two datasets, J-HMDB and UCF-101, show that the proposed network achieves state-of-the-art performance on action detection.
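
The abstract does not describe the internals of RCSA, so the following is only a rough illustration of the general idea of channel ("what") and spatial ("where") gating with a residual connection. It assumes a feature map stored as nested lists of shape C x H x W, uses parameter-free sigmoid gates computed from simple averages in place of any learned layers, and the names `channel_gate`, `spatial_gate`, and `rcsa_sketch` are hypothetical. It should not be read as the authors' implementation.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_gate(x):
    """One weight per channel ("what"): sigmoid of the channel's global average.

    Assumption: a real module would likely pass pooled statistics through
    learned layers; this sketch has no learned parameters.
    """
    gates = []
    for channel in x:
        n = len(channel) * len(channel[0])
        gates.append(sigmoid(sum(sum(row) for row in channel) / n))
    return gates


def spatial_gate(x):
    """One weight per location ("where"): sigmoid of the mean across channels."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    return [[sigmoid(sum(x[k][i][j] for k in range(c)) / c) for j in range(w)]
            for i in range(h)]


def rcsa_sketch(x):
    """Apply channel gating, then spatial gating, then add the input back."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    ch = channel_gate(x)
    # Re-weight each channel by its gate.
    refined = [[[x[k][i][j] * ch[k] for j in range(w)] for i in range(h)]
               for k in range(c)]
    # Re-weight each location by a gate computed from the refined features.
    sp = spatial_gate(refined)
    # Residual connection: the attended features are added to the input.
    return [[[x[k][i][j] + refined[k][i][j] * sp[i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]


if __name__ == "__main__":
    # Toy 2-channel, 2x2 feature map (made-up values).
    features = [[[1.0, 0.0], [0.0, 2.0]], [[0.5, 0.5], [0.5, 0.5]]]
    for channel in rcsa_sketch(features):
        print(channel)
```

Because both gates lie strictly between 0 and 1 and the input is added back, this sketch can only emphasize features relative to one another; it never zeroes out the original signal, which is the usual motivation for a residual attention design.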

Original language: English
Title of host publication: Intelligent Robotics and Applications - 15th International Conference, ICIRA 2022, Proceedings
Editors: Honghai Liu, Weihong Ren, Zhouping Yin, Lianqing Liu, Li Jiang, Guoying Gu, Xinyu Wu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 177-187
Number of pages: 11
ISBN (Print): 9783031138430
Publication status: Published - 2022
Event: 15th International Conference on Intelligent Robotics and Applications, ICIRA 2022 - Harbin, China
Duration: 1 Aug 2022 → 3 Aug 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13455 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th International Conference on Intelligent Robotics and Applications, ICIRA 2022
Country/Territory: China
City: Harbin
Period: 1/08/22 → 3/08/22

Keywords

  • Action detection
  • Deep neural network
  • Feature aggregation
  • Residual channel-spatial attention
