Adaptive Recursive Circle Framework for Fine-Grained Action Recognition

Hanxi Lin; Wentian Zhao; Xinxiao Wu

doi:10.1109/ICME52920.2022.9859982

Hanxi Lin, Wentian Zhao, Xinxiao Wu*

*Corresponding author for this work

School of Computer Science and Technology

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Intuitively, distinguishing fine-grained actions in videos requires recursively capturing subtle visual cues and learning abstract features. However, existing methods based on deep neural networks are counter-intuitive in that their network layers do not explicitly model this recursive feature abstraction. We therefore propose an Adaptive Recursive Circle (ARC) framework that equips common neural network layers with recursive attention and recursive fusion. An ARC layer inherits the same operators and parameters as the original layer but, most critically, treats the layer input as an evolving state, explicitly achieving recursive feature abstraction by alternating between state update and feature generation. Specifically, at each recursive step, the input state is first updated via both recursive attention and recursive fusion from the previously generated features, and feature abstraction is then performed on the newly updated input state. Significant improvements are observed on multiple datasets. For example, an ARC-equipped TSM-ResNet-18 outperforms TSM-ResNet-50 on the Something-Something V1 and Diving48 datasets with only half the overhead. Code will be available at: https://github.com/0HaNC/ARC-ActionRecog.
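The alternation described in the abstract (update the input state from the previous features, then re-run the same layer on the updated state) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the form of recursive attention or recursive fusion, so the sigmoid gating, the convex blend, the `fuse_rate` parameter, and the names `dense_layer` and `arc_forward` are all assumptions made for illustration. It also assumes a square weight matrix so that features and inputs have the same length.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dense_layer(weights, state):
    """Stand-in for the wrapped 'original layer': linear map followed by tanh."""
    return [math.tanh(sum(w * s for w, s in zip(row, state))) for row in weights]


def arc_forward(weights, x, steps=3, fuse_rate=0.5):
    """Toy recursive loop: the layer input is treated as an evolving state.

    The same weights are reused at every step; only the state changes.
    """
    state = list(x)
    features = dense_layer(weights, state)
    for _ in range(steps):
        # Recursive attention (assumed form): gate the raw input with the
        # previously generated features.
        attended = [sigmoid(f) * xi for f, xi in zip(features, x)]
        # Recursive fusion (assumed form): blend the old state with the
        # attended input.
        state = [(1.0 - fuse_rate) * s + fuse_rate * a
                 for s, a in zip(state, attended)]
        # Feature generation with the newly updated state.
        features = dense_layer(weights, state)
    return features


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(arc_forward(identity, [0.5, -0.5], steps=3))
```

With `steps=0` the function reduces to a single pass through the original layer, which mirrors the claim that an ARC layer inherits the operators and parameters of the layer it wraps.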

Original language	English
Title of host publication	ICME 2022 - IEEE International Conference on Multimedia and Expo 2022, Proceedings
Publisher	IEEE Computer Society
ISBN (Electronic)	9781665485630
DOIs	https://doi.org/10.1109/ICME52920.2022.9859982
Publication status	Published - 2022
Event	2022 IEEE International Conference on Multimedia and Expo, ICME 2022 - Taipei, Taiwan, Province of China
Duration	18 Jul 2022 → 22 Jul 2022

Publication series

Name	Proceedings - IEEE International Conference on Multimedia and Expo
Volume	2022-July
ISSN (Print)	1945-7871
ISSN (Electronic)	1945-788X

Conference

Conference	2022 IEEE International Conference on Multimedia and Expo, ICME 2022
Country/Territory	Taiwan, Province of China
City	Taipei
Period	18/07/22 → 22/07/22

Keywords

fine-grained action recognition
recursive representation
representation learning
visual reasoning

Access to Document

10.1109/ICME52920.2022.9859982

Cite this

@inproceedings{d78e17e5d3dd45f081840e7e03bb67ec,

title = "Adaptive Recursive Circle Framework for Fine-Grained Action Recognition",

abstract = "Intuitively, distinguishing fine-grained actions in videos requires recursively capturing subtle visual cues and learning abstract features. However, existing deep neural network based methods are counter-intuitive in that their network layers do not explicitly model the recursive feature abstraction. Therefore, we are motivated to propose an Adaptive Recursive Circle (ARC) framework that equips common neural network layers with recursive attention and recursive fusion. ARC layer inherits the same operators and parameters as the original layer, but, most critically, it treats the layer input as an evolving state, thus explicitly achieving recursive feature abstraction by alternating the state update and the feature generation. Specifically, at each recursive step, the input state is firstly updated via both recursive attention and recursive fusion from the previously generated features, and then the feature abstraction is performed with the newly updated input state. Significant improvements are observed on multiple datasets. For example, an ARC-equipped TSM-ResNet-18 outperforms TSM-ResNet-50 on the Something-Something V1 and Diving48 datasets with only half over-heads. Code will be available at: https://github.com/0HaNC/ARC-ActionRecog.",

keywords = "fine-grained action recognition, recursive representation, representation learning, visual reasoning",

author = "Hanxi Lin and Wentian Zhao and Xinxiao Wu",

note = "Publisher Copyright: {\textcopyright} 2022 IEEE.; 2022 IEEE International Conference on Multimedia and Expo, ICME 2022 ; Conference date: 18-07-2022 Through 22-07-2022",

year = "2022",

doi = "10.1109/ICME52920.2022.9859982",

language = "English",

series = "Proceedings - IEEE International Conference on Multimedia and Expo",

publisher = "IEEE Computer Society",

booktitle = "ICME 2022 - IEEE International Conference on Multimedia and Expo 2022, Proceedings",

address = "United States",

}

Lin, H, Zhao, W & Wu, X 2022, Adaptive Recursive Circle Framework for Fine-Grained Action Recognition. in ICME 2022 - IEEE International Conference on Multimedia and Expo 2022, Proceedings. Proceedings - IEEE International Conference on Multimedia and Expo, vol. 2022-July, IEEE Computer Society, 2022 IEEE International Conference on Multimedia and Expo, ICME 2022, Taipei, Taiwan, Province of China, 18/07/22. https://doi.org/10.1109/ICME52920.2022.9859982

Adaptive Recursive Circle Framework for Fine-Grained Action Recognition. / Lin, Hanxi; Zhao, Wentian; Wu, Xinxiao.
ICME 2022 - IEEE International Conference on Multimedia and Expo 2022, Proceedings. IEEE Computer Society, 2022. (Proceedings - IEEE International Conference on Multimedia and Expo; Vol. 2022-July).

TY - GEN

T1 - Adaptive Recursive Circle Framework for Fine-Grained Action Recognition

AU - Lin, Hanxi

AU - Zhao, Wentian

AU - Wu, Xinxiao

PY - 2022

Y1 - 2022

AB - Intuitively, distinguishing fine-grained actions in videos requires recursively capturing subtle visual cues and learning abstract features. However, existing deep neural network based methods are counter-intuitive in that their network layers do not explicitly model the recursive feature abstraction. Therefore, we are motivated to propose an Adaptive Recursive Circle (ARC) framework that equips common neural network layers with recursive attention and recursive fusion. ARC layer inherits the same operators and parameters as the original layer, but, most critically, it treats the layer input as an evolving state, thus explicitly achieving recursive feature abstraction by alternating the state update and the feature generation. Specifically, at each recursive step, the input state is firstly updated via both recursive attention and recursive fusion from the previously generated features, and then the feature abstraction is performed with the newly updated input state. Significant improvements are observed on multiple datasets. For example, an ARC-equipped TSM-ResNet-18 outperforms TSM-ResNet-50 on the Something-Something V1 and Diving48 datasets with only half over-heads. Code will be available at: https://github.com/0HaNC/ARC-ActionRecog.

KW - fine-grained action recognition

KW - recursive representation

KW - representation learning

KW - visual reasoning

UR - http://www.scopus.com/inward/record.url?scp=85137739490&partnerID=8YFLogxK

U2 - 10.1109/ICME52920.2022.9859982

DO - 10.1109/ICME52920.2022.9859982

M3 - Conference contribution

AN - SCOPUS:85137739490

T3 - Proceedings - IEEE International Conference on Multimedia and Expo

BT - ICME 2022 - IEEE International Conference on Multimedia and Expo 2022, Proceedings

PB - IEEE Computer Society

T2 - 2022 IEEE International Conference on Multimedia and Expo, ICME 2022

Y2 - 18 July 2022 through 22 July 2022

ER -
