Weakly-supervised action localization via embedding-modeling iterative optimization

Xiao Yu Zhang*, Haichao Shi, Changsheng Li, Peng Li, Zekun Li, Peng Ren

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

11 Citations (Scopus)

Abstract

Action recognition and localization in untrimmed videos under weak supervision is a challenging problem with great application prospects. Because video-level labels carry limited information, a promising approach is to leverage the knowledge learned on trimmed videos to facilitate the analysis of untrimmed videos, given that abundant trimmed videos are publicly available and well segmented with semantic descriptions. To enforce effective trimmed-untrimmed augmentation, this paper presents a novel embedding-modeling iterative optimization network, referred to as IONet. In the proposed method, action classification modeling and shared subspace embedding are learned jointly in an iterative way, so that robust cross-domain knowledge transfer is achieved. With a carefully designed two-stage self-attentive representation learning workflow for untrimmed videos, irrelevant backgrounds are eliminated and fine-grained temporal relevance can be robustly explored. Extensive experiments are conducted on two benchmark datasets, THUMOS14 and ActivityNet1.3, and the experimental results corroborate the efficacy of the method. Source code is available on GitHub.

Original language: English
Article number: 107831
Journal: Pattern Recognition
Volume: 113
Publication status: Published - May 2021