Temporal-visual proposal graph network for temporal action detection

Ming Gang Gan; Yan Zhang; Shaowen Su

doi:10.1007/s10489-023-04947-0

Ming Gang Gan, Yan Zhang^*, Shaowen Su

^*Corresponding author for this work

School of Automation

Beijing Institute of Technology

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Temporal action detection is usually divided into two stages: temporal action proposal generation and proposal classification. Most methods consider the proposal classification stage as an action recognition task. However, compared with trimmed videos, proposals generally contain part of the ground-truth action, lacking enough semantic information to predict their categories precisely. In this paper, we propose a novel temporal-visual proposal graph (TVPG) module to acquire sufficient semantic information for action proposal classification. The module first adopts a proposal graph construction strategy to select valuable neighbor proposals for each proposal and constructs them into an action proposal graph. Then, it applies a temporal graph convolution network and a visual graph convolution network in parallel on the graph to improve proposal feature quality by obtaining action information from neighbors. In the temporal graph convolution network, we design a novel temporal graph convolution operation that embeds temporal position relation information into proposal features and extracts the information from other proposals by temporal position relations. Based on the TVPG module, we construct an action proposal classification model named the temporal-visual proposal graph network (TVPGN) and perform extensive experiments on two benchmarks. The results show that TVPGN achieves competitive performance on both datasets.

Original language	English
Pages (from-to)	26008-26026
Number of pages	19
Journal	Applied Intelligence
Volume	53
Issue number	21
DOI	https://doi.org/10.1007/s10489-023-04947-0
Publication status	Published - Nov 2023

Access to document

10.1007/s10489-023-04947-0

Other files and links

Link to publication in Scopus

Cite this

@article{2b4290305c1e4901a2cb77522cc66454,

title = "Temporal-visual proposal graph network for temporal action detection",

abstract = "Temporal action detection is usually divided into two stages: temporal action proposal generation and proposal classification. Most methods consider the proposal classification stage as an action recognition task. However, compared with trimmed videos, proposals generally contain part of the ground-truth action, lacking enough semantic information to predict their categories precisely. In this paper, we propose a novel temporal-visual proposal graph (TVPG) module to acquire sufficient semantic information for action proposal classification. The module first adopts a proposal graph construction strategy to select valuable neighbor proposals for each proposal and constructs them into an action proposal graph. Then, it applies a temporal graph convolution network and a visual graph convolution network in parallel on the graph to improve proposal feature quality by obtaining action information from neighbors. In the temporal graph convolution network, we design a novel temporal graph convolution operation that embeds temporal position relation information into proposal features and extracts the information from other proposals by temporal position relations. Based on the TVPG module, we construct an action proposal classification model named the temporal-visual proposal graph network (TVPGN) and perform extensive experiments on two benchmarks. The results show that TVPGN achieves competitive performance on both datasets.",

keywords = "Action proposal classification, Action proposal graph, Graph convolution, Temporal action detection",

author = "Gan, {Ming Gang} and Yan Zhang and Shaowen Su",

note = "Publisher Copyright: {\textcopyright} 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.",

year = "2023",

month = nov,

doi = "10.1007/s10489-023-04947-0",

language = "English",

volume = "53",

pages = "26008--26026",

journal = "Applied Intelligence",

issn = "0924-669X",

publisher = "Springer Netherlands",

number = "21",

}

TY - JOUR

T1 - Temporal-visual proposal graph network for temporal action detection

AU - Gan, Ming Gang

AU - Zhang, Yan

AU - Su, Shaowen

PY - 2023/11

Y1 - 2023/11

N2 - Temporal action detection is usually divided into two stages: temporal action proposal generation and proposal classification. Most methods consider the proposal classification stage as an action recognition task. However, compared with trimmed videos, proposals generally contain part of the ground-truth action, lacking enough semantic information to predict their categories precisely. In this paper, we propose a novel temporal-visual proposal graph (TVPG) module to acquire sufficient semantic information for action proposal classification. The module first adopts a proposal graph construction strategy to select valuable neighbor proposals for each proposal and constructs them into an action proposal graph. Then, it applies a temporal graph convolution network and a visual graph convolution network in parallel on the graph to improve proposal feature quality by obtaining action information from neighbors. In the temporal graph convolution network, we design a novel temporal graph convolution operation that embeds temporal position relation information into proposal features and extracts the information from other proposals by temporal position relations. Based on the TVPG module, we construct an action proposal classification model named the temporal-visual proposal graph network (TVPGN) and perform extensive experiments on two benchmarks. The results show that TVPGN achieves competitive performance on both datasets.

AB - Temporal action detection is usually divided into two stages: temporal action proposal generation and proposal classification. Most methods consider the proposal classification stage as an action recognition task. However, compared with trimmed videos, proposals generally contain part of the ground-truth action, lacking enough semantic information to predict their categories precisely. In this paper, we propose a novel temporal-visual proposal graph (TVPG) module to acquire sufficient semantic information for action proposal classification. The module first adopts a proposal graph construction strategy to select valuable neighbor proposals for each proposal and constructs them into an action proposal graph. Then, it applies a temporal graph convolution network and a visual graph convolution network in parallel on the graph to improve proposal feature quality by obtaining action information from neighbors. In the temporal graph convolution network, we design a novel temporal graph convolution operation that embeds temporal position relation information into proposal features and extracts the information from other proposals by temporal position relations. Based on the TVPG module, we construct an action proposal classification model named the temporal-visual proposal graph network (TVPGN) and perform extensive experiments on two benchmarks. The results show that TVPGN achieves competitive performance on both datasets.

KW - Action proposal classification

KW - Action proposal graph

KW - Graph convolution

KW - Temporal action detection

UR - http://www.scopus.com/inward/record.url?scp=85168126193&partnerID=8YFLogxK

U2 - 10.1007/s10489-023-04947-0

DO - 10.1007/s10489-023-04947-0

M3 - Article

AN - SCOPUS:85168126193

SN - 0924-669X

VL - 53

SP - 26008

EP - 26026

JO - Applied Intelligence

JF - Applied Intelligence

IS - 21

ER -
