What and Where to See: Deep Attention Aggregation Network for Action Detection

Yuxuan He; Ming Gang Gan; Xiaozhou Liu

doi:10.1007/978-3-031-13844-7_18

Yuxuan He, Ming Gang Gan^*, Xiaozhou Liu

^*Corresponding author for this work

School of Automation

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

With the development of deep convolutional neural networks, 2D CNNs are widely used in action detection tasks. Although 2D CNNs extract rich features from video frames, these features also contain redundant information. To address this problem, we propose the Residual Channel-Spatial Attention (RCSA) module, which guides the network toward what (object patterns) and where (spatially) to focus. Meanwhile, to make effective use of the rich spatial and semantic features extracted by different layers of deep networks, we combine RCSA with a deep aggregation network to form the Deep Attention Aggregation Network. Experimental results on two datasets, J-HMDB and UCF-101, show that the proposed network achieves state-of-the-art performance on action detection.

Original language	English
Title of host publication	Intelligent Robotics and Applications - 15th International Conference, ICIRA 2022, Proceedings
Editors	Honghai Liu, Weihong Ren, Zhouping Yin, Lianqing Liu, Li Jiang, Guoying Gu, Xinyu Wu
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	177-187
Number of pages	11
ISBN (Print)	9783031138430
DOI	https://doi.org/10.1007/978-3-031-13844-7_18
Publication status	Published - 2022
Event	15th International Conference on Intelligent Robotics and Applications, ICIRA 2022 - Harbin, China. Duration: 1 Aug 2022 → 3 Aug 2022

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	13455 LNAI
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	15th International Conference on Intelligent Robotics and Applications, ICIRA 2022
Country/Territory	China
City	Harbin
Period	1/08/22 → 3/08/22

Access to document

10.1007/978-3-031-13844-7_18

Other files and links

Link to publication in Scopus

Cite this

He, Y., Gan, M. G., & Liu, X. (2022). What and Where to See: Deep Attention Aggregation Network for Action Detection. In H. Liu, W. Ren, Z. Yin, L. Liu, L. Jiang, G. Gu, & X. Wu (Eds.), Intelligent Robotics and Applications - 15th International Conference, ICIRA 2022, Proceedings (pp. 177-187). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13455 LNAI). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-13844-7_18

He, Yuxuan ; Gan, Ming Gang ; Liu, Xiaozhou. / What and Where to See : Deep Attention Aggregation Network for Action Detection. Intelligent Robotics and Applications - 15th International Conference, ICIRA 2022, Proceedings. editor / Honghai Liu ; Weihong Ren ; Zhouping Yin ; Lianqing Liu ; Li Jiang ; Guoying Gu ; Xinyu Wu. Springer Science and Business Media Deutschland GmbH, 2022. pp. 177-187 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{fb016e2763014e7784767e61e4534061,

title = "What and Where to See: Deep Attention Aggregation Network for Action Detection",

abstract = "With the development of deep convolutional neural networks, 2D CNN is widely used in action detection task. Although 2D CNN extracts rich features from video frames, these features also contain redundant information. In response to this problem, we propose Residual Channel-Spatial Attention module (RCSA) to guide the network what (object patterns) and where (spatially) need to be focused. Meanwhile, in order to effectively utilize the rich spatial and semantic features extracted by different layers of deep networks, we combine RCSA and deep aggregation network to propose Deep Attention Aggregation Network. Experiment results on two datasets J-HMDB and UCF-101 show that the proposed network achieves state-of-the-art performances on action detection.",

keywords = "Action detection, Deep neural network, Feature aggregation, Residual channel-spatial attention",

author = "Yuxuan He and Gan, {Ming Gang} and Xiaozhou Liu",

note = "Publisher Copyright: {\textcopyright} 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.; 15th International Conference on Intelligent Robotics and Applications, ICIRA 2022 ; Conference date: 01-08-2022 Through 03-08-2022",

year = "2022",

doi = "10.1007/978-3-031-13844-7_18",

language = "English",

isbn = "9783031138430",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "177--187",

editor = "Honghai Liu and Weihong Ren and Zhouping Yin and Lianqing Liu and Li Jiang and Guoying Gu and Xinyu Wu",

booktitle = "Intelligent Robotics and Applications - 15th International Conference, ICIRA 2022, Proceedings",

address = "Germany",

}

He, Y, Gan, MG & Liu, X 2022, What and Where to See: Deep Attention Aggregation Network for Action Detection. in H Liu, W Ren, Z Yin, L Liu, L Jiang, G Gu & X Wu (eds), Intelligent Robotics and Applications - 15th International Conference, ICIRA 2022, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13455 LNAI, Springer Science and Business Media Deutschland GmbH, pp. 177-187, 15th International Conference on Intelligent Robotics and Applications, ICIRA 2022, Harbin, China, 1/08/22. https://doi.org/10.1007/978-3-031-13844-7_18

What and Where to See: Deep Attention Aggregation Network for Action Detection. / He, Yuxuan; Gan, Ming Gang; Liu, Xiaozhou.
Intelligent Robotics and Applications - 15th International Conference, ICIRA 2022, Proceedings. ed. / Honghai Liu; Weihong Ren; Zhouping Yin; Lianqing Liu; Li Jiang; Guoying Gu; Xinyu Wu. Springer Science and Business Media Deutschland GmbH, 2022. p. 177-187 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13455 LNAI).

TY - GEN

T1 - What and Where to See

T2 - 15th International Conference on Intelligent Robotics and Applications, ICIRA 2022

AU - He, Yuxuan

AU - Gan, Ming Gang

AU - Liu, Xiaozhou

PY - 2022

Y1 - 2022

N2 - With the development of deep convolutional neural networks, 2D CNN is widely used in action detection task. Although 2D CNN extracts rich features from video frames, these features also contain redundant information. In response to this problem, we propose Residual Channel-Spatial Attention module (RCSA) to guide the network what (object patterns) and where (spatially) need to be focused. Meanwhile, in order to effectively utilize the rich spatial and semantic features extracted by different layers of deep networks, we combine RCSA and deep aggregation network to propose Deep Attention Aggregation Network. Experiment results on two datasets J-HMDB and UCF-101 show that the proposed network achieves state-of-the-art performances on action detection.

AB - With the development of deep convolutional neural networks, 2D CNN is widely used in action detection task. Although 2D CNN extracts rich features from video frames, these features also contain redundant information. In response to this problem, we propose Residual Channel-Spatial Attention module (RCSA) to guide the network what (object patterns) and where (spatially) need to be focused. Meanwhile, in order to effectively utilize the rich spatial and semantic features extracted by different layers of deep networks, we combine RCSA and deep aggregation network to propose Deep Attention Aggregation Network. Experiment results on two datasets J-HMDB and UCF-101 show that the proposed network achieves state-of-the-art performances on action detection.

KW - Action detection

KW - Deep neural network

KW - Feature aggregation

KW - Residual channel-spatial attention

UR - http://www.scopus.com/inward/record.url?scp=85135774499&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-13844-7_18

DO - 10.1007/978-3-031-13844-7_18

M3 - Conference contribution

AN - SCOPUS:85135774499

SN - 9783031138430

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 177

EP - 187

BT - Intelligent Robotics and Applications - 15th International Conference, ICIRA 2022, Proceedings

A2 - Liu, Honghai

A2 - Ren, Weihong

A2 - Yin, Zhouping

A2 - Liu, Lianqing

A2 - Jiang, Li

A2 - Gu, Guoying

A2 - Wu, Xinyu

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 1 August 2022 through 3 August 2022

ER -

He Y, Gan MG, Liu X. What and Where to See: Deep Attention Aggregation Network for Action Detection. In Liu H, Ren W, Yin Z, Liu L, Jiang L, Gu G, Wu X, editors, Intelligent Robotics and Applications - 15th International Conference, ICIRA 2022, Proceedings. Springer Science and Business Media Deutschland GmbH. 2022. p. 177-187. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-031-13844-7_18