TwinNet: Twin Structured Knowledge Transfer Network for Weakly Supervised Action Localization

Xiao Yu Zhang, Hai Chao Shi*, Chang Sheng Li, Li Xin Duan

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

Action recognition and localization in untrimmed videos are important for many applications and have attracted considerable attention. Since full supervision with frame-level annotation places an overwhelming burden on manual labeling, learning with weak video-level supervision is a potential solution. In this paper, we propose a novel weakly supervised framework that simultaneously recognizes actions and locates the corresponding frames in untrimmed videos. Because abundant trimmed videos are publicly available, well segmented, and accompanied by semantic descriptions, the knowledge learned on trimmed videos can be leveraged to analyze untrimmed videos. We present an effective knowledge transfer strategy based on inter-class semantic relevance. We also use the self-attention mechanism to obtain a compact video representation, so that the influence of background frames is effectively suppressed. A learning architecture with twin networks, one for trimmed and one for untrimmed videos, is designed to facilitate transferable self-attentive representation learning. Extensive experiments are conducted on three untrimmed benchmark datasets (i.e., THUMOS14, ActivityNet1.3, and MEXaction2), and the experimental results corroborate the efficacy of our method. It is especially encouraging that the proposed weakly supervised method achieves results comparable to those of some fully supervised methods.
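As a rough illustration of the self-attentive pooling idea mentioned in the abstract, the sketch below collapses per-frame features into one video-level vector using softmax attention weights, so that low-scoring (background-like) frames contribute little. It is a minimal sketch under stated assumptions, not the authors' architecture: the function name `attention_pool`, the linear scoring vector `w`, the random inputs, and the above-average threshold used to pick candidate action frames are all hypothetical.

```python
import numpy as np


def attention_pool(frame_features, w):
    """Pool per-frame features into one video vector with softmax attention.

    frame_features: (T, D) array, one row per frame.
    w: (D,) scoring vector (hypothetical; it would be learned in a real model).
    Returns the pooled (D,) vector and the (T,) attention weights.
    """
    scores = frame_features @ w                 # one relevance score per frame
    scores = scores - scores.max()              # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    video_vector = weights @ frame_features     # attention-weighted average
    return video_vector, weights


# Toy input: 8 frames with 4-dimensional features (random, for illustration only).
rng = np.random.default_rng(0)
frames = rng.normal(size=(8, 4))
w = rng.normal(size=4)

video_vec, attn = attention_pool(frames, w)

# Assumed localization heuristic: keep frames whose weight exceeds the uniform weight.
action_frames = np.flatnonzero(attn > 1.0 / len(attn))
```

In this sketch the same attention weights serve two roles: they build the compact video representation used for classification, and they give a per-frame relevance signal that could be thresholded for localization.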

Original language: English
Pages (from-to): 227-246
Number of pages: 20
Journal: Machine Intelligence Research
Volume: 19
Issue number: 3
Publication status: Published - Jun 2022

Keywords

  • Knowledge transfer
  • action localization
  • representation learning
  • self-attention mechanism
  • weakly supervised learning
