Temporal-visual proposal graph network for temporal action detection

Ming Gang Gan, Yan Zhang*, Shaowen Su

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Temporal action detection is usually divided into two stages: temporal action proposal generation and proposal classification. Most methods treat the proposal classification stage as an action recognition task. However, compared with trimmed videos, proposals generally contain only part of the ground-truth action and therefore lack enough semantic information to predict their categories precisely. In this paper, we propose a novel temporal-visual proposal graph (TVPG) module to acquire sufficient semantic information for action proposal classification. The module first adopts a proposal graph construction strategy to select valuable neighbor proposals for each proposal and assembles them into an action proposal graph. Then, it applies a temporal graph convolution network and a visual graph convolution network in parallel on the graph to improve proposal feature quality by obtaining action information from neighbors. In the temporal graph convolution network, we design a novel temporal graph convolution operation that embeds temporal position relation information into proposal features and extracts information from other proposals according to their temporal position relations. Based on the TVPG module, we construct an action proposal classification model named the temporal-visual proposal graph network (TVPGN) and perform extensive experiments on two benchmarks. The results show that TVPGN achieves competitive performance on both datasets.
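
The general idea of linking each proposal to a few neighbors and then pooling their features can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not state how "valuable" neighbors are chosen or how the temporal graph convolution is defined. The sketch assumes temporal IoU as the neighbor criterion and a plain mean over neighbors as the aggregation step, and the names `temporal_iou`, `build_proposal_graph`, `aggregate`, `k`, and `min_iou` are all hypothetical.

```python
from typing import List, Tuple

Proposal = Tuple[float, float]  # (start, end) in seconds


def temporal_iou(a: Proposal, b: Proposal) -> float:
    """Intersection-over-union of two temporal segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def build_proposal_graph(
    proposals: List[Proposal], k: int = 2, min_iou: float = 0.1
) -> List[List[int]]:
    """For each proposal, keep up to k neighbors with the highest tIoU.

    ASSUMPTION: tIoU ranking stands in for the paper's (unspecified)
    neighbor selection strategy.
    """
    neighbors: List[List[int]] = []
    for i, p in enumerate(proposals):
        scored = [
            (temporal_iou(p, q), j)
            for j, q in enumerate(proposals)
            if j != i
        ]
        scored = [(s, j) for s, j in scored if s >= min_iou]
        # Highest overlap first; ties broken by index for determinism.
        scored.sort(key=lambda t: (-t[0], t[1]))
        neighbors.append([j for _, j in scored[:k]])
    return neighbors


def aggregate(
    features: List[List[float]], neighbors: List[List[int]]
) -> List[List[float]]:
    """Average each proposal's feature with its neighbors' features.

    ASSUMPTION: an unweighted mean replaces the learned graph
    convolutions described in the abstract.
    """
    out: List[List[float]] = []
    for i, nbrs in enumerate(neighbors):
        group = [features[i]] + [features[j] for j in nbrs]
        dim = len(features[i])
        out.append([sum(v[d] for v in group) / len(group) for d in range(dim)])
    return out
```

For example, with proposals `[(0, 10), (5, 15), (100, 110)]`, the first two overlap and become each other's neighbors, while the third stays isolated and keeps its own feature unchanged after aggregation.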

Original language: English
Pages (from-to): 26008-26026
Number of pages: 19
Journal: Applied Intelligence
Volume: 53
Issue number: 21
Publication status: Published - Nov 2023
