AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization

Xiao Yu Zhang, Changsheng Li, Haichao Shi*, Xiaobin Zhu, Peng Li, Jing Dong

*此作品的通讯作者

科研成果: 期刊稿件文章同行评审

32 引用 (Scopus)

摘要

The point process is a solid framework to model sequential data, such as videos, by exploring the underlying relevance. As a challenging problem for high-level video understanding, weakly supervised action recognition and localization in untrimmed videos have attracted intensive research attention. Knowledge transfer by leveraging the publicly available trimmed videos as external guidance is a promising attempt to make up for the coarse-grained video-level annotation and improve the generalization performance. However, unconstrained knowledge transfer may bring about irrelevant noise and jeopardize the learning model. This article proposes a novel adaptability decomposing encoder-decoder network to transfer reliable knowledge between the trimmed and untrimmed videos for action recognition and localization by bidirectional point process modeling, given only video-level annotations. By decomposing the original features into the domain-adaptable and domain-specific ones based on their adaptability, trimmed-untrimmed knowledge transfer can be safely confined within a more coherent subspace. An encoder-decoder-based structure is carefully designed and jointly optimized to facilitate effective action classification and temporal localization. Extensive experiments are conducted on two benchmark data sets (i.e., THUMOS14 and ActivityNet1.3), and the experimental results clearly corroborate the efficacy of our method.

源语言英语
页(从-至)1852-1863
页数12
期刊IEEE Transactions on Neural Networks and Learning Systems
34
4
DOI
出版状态已出版 - 1 4月 2023

指纹

探究 'AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization' 的科研主题。它们共同构成独一无二的指纹。

引用此