Deeply-supervised CNN model for action recognition with trainable feature aggregation

Yang Li, Kan Li*, Xinxin Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

In this paper, we propose a deeply-supervised CNN model for action recognition that fully exploits the powerful hierarchical features of CNNs. In this model, we build multi-level video representations by applying our proposed aggregation module at different convolutional layers. Moreover, we train the model in a deeply-supervised manner, which improves both performance and efficiency. To capture the temporal structure of actions while preserving more detail, we propose a trainable aggregation module: it models the temporal evolution of each spatial location and projects it into a semantic space using the Vector of Locally Aggregated Descriptors (VLAD) technique. The deeply-supervised CNN model, integrating this aggregation module, provides a promising solution for recognizing actions in videos. We conduct experiments on two action recognition datasets, HMDB51 and UCF101, and the results show that our model outperforms state-of-the-art methods.
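The record does not include an implementation; the PyTorch sketch below only illustrates the two ideas named in the abstract, a trainable VLAD-style aggregation module and deeply-supervised auxiliary classifiers attached to multiple convolutional layers. The class names (`TrainableVLAD`, `DeepSupervisionHead`), the NetVLAD-like soft-assignment formulation, the cluster count, and the simple loss summation are all assumptions for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrainableVLAD(nn.Module):
    """Hypothetical trainable VLAD aggregation (NetVLAD-style sketch).

    Every spatio-temporal feature vector is softly assigned to K learned
    cluster centers; the accumulated residuals project the temporal
    evolution of the feature maps into a semantic (cluster) space.
    """
    def __init__(self, dim, num_clusters=32):
        super().__init__()
        self.centers = nn.Parameter(0.01 * torch.randn(num_clusters, dim))
        self.assign = nn.Conv1d(dim, num_clusters, kernel_size=1)  # soft-assignment logits

    def forward(self, feats):
        # feats: (B, T, C, H, W) conv features from T sampled frames
        b, t, c, h, w = feats.shape
        x = feats.permute(0, 2, 1, 3, 4).reshape(b, c, -1)   # (B, C, N), N = T*H*W descriptors
        soft = F.softmax(self.assign(x), dim=1)               # (B, K, N) soft assignments
        # sum of residuals between descriptors and each cluster center
        vlad = torch.einsum('bkn,bcn->bkc', soft, x) \
             - soft.sum(-1).unsqueeze(-1) * self.centers      # (B, K, C)
        vlad = F.normalize(vlad, p=2, dim=-1)                 # intra-normalization per cluster
        return F.normalize(vlad.flatten(1), p=2, dim=-1)      # (B, K*C) video descriptor

class DeepSupervisionHead(nn.Module):
    """Auxiliary classifier attached to one intermediate conv layer."""
    def __init__(self, dim, num_classes, num_clusters=32):
        super().__init__()
        self.agg = TrainableVLAD(dim, num_clusters)
        self.fc = nn.Linear(num_clusters * dim, num_classes)

    def forward(self, feats):
        return self.fc(self.agg(feats))

# Deep supervision (sketch): attach a head to each chosen conv layer and
# sum the per-level classification losses during training.
def deeply_supervised_loss(heads, multi_level_feats, labels):
    return sum(F.cross_entropy(head(f), labels)
               for head, f in zip(heads, multi_level_feats))
```

At test time, the per-level predictions could be fused, for example by averaging the heads' softmax scores or keeping only the deepest head; the paper's exact fusion scheme is not stated in this record.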

Original language: English
Title of host publication: Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
Editors: Jerome Lang
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 807-813
Number of pages: 7
ISBN (Electronic): 9780999241127
DOI
Publication status: Published - 2018
Event: 27th International Joint Conference on Artificial Intelligence, IJCAI 2018 - Stockholm, Sweden
Duration: 13 Jul 2018 → 19 Jul 2018

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2018-July
ISSN (Print): 1045-0823

Conference

Conference: 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
Country/Territory: Sweden
City: Stockholm
Period: 13/07/18 → 19/07/18
