TY - GEN
T1 - Denoised Temporal Relation Network for Temporal Action Segmentation
AU - Ma, Zhichao
AU - Li, Kan
N1 - Publisher Copyright: © 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
PY - 2024
Y1 - 2024
AB - Temporal relations among action segments play a crucial role in temporal action segmentation. Existing methods tend to employ graph neural networks to model these relations. However, their performance is unsatisfactory, and they exhibit serious over-segmentation because the generated features are noisy. To address these issues, we present an action segmentation framework termed the denoised temporal relation network (DTRN). In DTRN, a temporal reasoning module (TRM) jointly models inter-segment temporal relations and denoises features. Specifically, the TRM applies an uncertainty-gated reasoning mechanism for noise immunity and uses a cross-attention-based structure to incorporate informative clues from the discriminative enhance module. To ensure that these clues are informative, that module is trained under Selective Margin Plasticity (SMP), which adaptively adjusts the decision boundary by changing specific margins in real time. Our framework achieves state-of-the-art accuracy, edit score, and F1 score on the challenging 50Salads, GTEA, and Breakfast benchmarks.
KW - Denoised Temporal Relation Network
KW - Selective Margin Plasticity
KW - Temporal Action Segmentation
UR - http://www.scopus.com/inward/record.url?scp=85181776840&partnerID=8YFLogxK
U2 - 10.1007/978-981-99-8537-1_23
DO - 10.1007/978-981-99-8537-1_23
M3 - Conference contribution
AN - SCOPUS:85181776840
SN - 9789819985364
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 282
EP - 294
BT - Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings
A2 - Liu, Qingshan
A2 - Wang, Hanzi
A2 - Ji, Rongrong
A2 - Ma, Zhanyu
A2 - Zheng, Weishi
A2 - Zha, Hongbin
A2 - Chen, Xilin
A2 - Wang, Liang
PB - Springer Science and Business Media Deutschland GmbH
T2 - 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023
Y2 - 13 October 2023 through 15 October 2023
ER -