TY - JOUR
T1 - Proposal Semantic Relationship Graph Network for Temporal Action Detection
AU - Su, Shaowen
AU - Zhang, Yan
AU - Gan, Minggang
N1 - Publisher Copyright: © 2024 Copyright held by the owner/author(s)
PY - 2024/12/13
Y1 - 2024/12/13
N2 - Temporal action detection, a critical task in video activity understanding, is typically divided into two stages: proposal generation and classification. However, most existing methods overlook the importance of information transfer among proposals during classification, often treating each proposal in isolation, which hampers accurate label prediction. In this article, we propose a novel method for inferring semantic relationships both within and between action proposals, guiding the fusion of action proposal features accordingly. Building on this approach, we introduce the Proposal Semantic Relationship Graph Network (PSRGN), an end-to-end model that leverages intra-proposal semantic relationship graphs to extract cross-scale temporal context and an inter-proposal semantic relationship graph to incorporate complementary neighboring information, significantly improving proposal feature quality and overall detection performance. This is the first method to apply graph structure learning in temporal action detection, adaptively constructing the inter-proposal semantic graph. Extensive experiments on two datasets demonstrate the effectiveness of our approach, achieving state-of-the-art (SOTA) performance. Code and results are available at http://github.com/Riiick2011/PSRGN.
AB - Temporal action detection, a critical task in video activity understanding, is typically divided into two stages: proposal generation and classification. However, most existing methods overlook the importance of information transfer among proposals during classification, often treating each proposal in isolation, which hampers accurate label prediction. In this article, we propose a novel method for inferring semantic relationships both within and between action proposals, guiding the fusion of action proposal features accordingly. Building on this approach, we introduce the Proposal Semantic Relationship Graph Network (PSRGN), an end-to-end model that leverages intra-proposal semantic relationship graphs to extract cross-scale temporal context and an inter-proposal semantic relationship graph to incorporate complementary neighboring information, significantly improving proposal feature quality and overall detection performance. This is the first method to apply graph structure learning in temporal action detection, adaptively constructing the inter-proposal semantic graph. Extensive experiments on two datasets demonstrate the effectiveness of our approach, achieving state-of-the-art (SOTA) performance. Code and results are available at http://github.com/Riiick2011/PSRGN.
KW - graph convolutional network
KW - graph structure learning
KW - proposal semantic relationship graph
KW - temporal action detection
UR - http://www.scopus.com/inward/record.url?scp=85217832692&partnerID=8YFLogxK
U2 - 10.1145/3702233
DO - 10.1145/3702233
M3 - Article
AN - SCOPUS:85217832692
SN - 2157-6904
VL - 15
JO - ACM Transactions on Intelligent Systems and Technology
JF - ACM Transactions on Intelligent Systems and Technology
IS - 6
M1 - ART135
ER -