Tackling confusion among actions for action segmentation with adaptive margin and energy-driven refinement

Zhichao Ma, Kan Li*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Video action segmentation is a crucial task for evaluating the ability to understand human activities. Previous work on this task mainly focuses on capturing complex temporal structures and fails to consider the feature ambiguity among similar actions and the bias in training sets, so these methods easily confuse some actions. In this paper, we propose a novel action segmentation framework, called DeConfuNet, to address this issue. First, we design a discriminative enhancement module (DEM) trained with adaptive margin-guided discriminative feature learning, which adjusts the margin adaptively to increase feature distinguishability among similar actions; the module's multi-stage reasoning and adaptive feature fusion structures provide structural advantages for distinguishing similar actions. Second, we propose an equalizing influence module (EIM) that overcomes the impact of biased training sets by balancing the influence of training samples under a coefficient-adaptive loss function. Third, an energy and context-driven refinement module (ECRM) further alleviates the impact of the unbalanced influence of training samples by fusing and refining the inferences of the DEM and EIM; it uses phased predictions that include context and energy clues to assimilate untrustworthy segments, which greatly alleviates over-segmentation. Extensive experiments show the effectiveness of each proposed technique and verify that the DEM and EIM are complementary in reasoning and cooperate to overcome the confusion issue. Our approach achieves significant improvements and state-of-the-art accuracy, edit score, and F1 score on the challenging 50Salads, GTEA, and Breakfast benchmarks.
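The abstract reports accuracy and edit score and names over-segmentation as a failure mode. As a rough illustration of why those two metrics are reported side by side, the sketch below computes frame-wise accuracy and a segment-level edit score in the way they are commonly defined for action segmentation (Levenshtein distance over the ordered segment labels, normalized by the longer sequence). This is not the authors' evaluation code; the function names and the toy labels ("cut", "mix") are assumptions made purely for illustration, and inputs are assumed to be non-empty and of equal length.

```python
from itertools import groupby


def segments(frames):
    """Collapse a per-frame label sequence into its ordered list of segment labels."""
    return [label for label, _ in groupby(frames)]


def levenshtein(a, b):
    """Edit distance between two label sequences (insert, delete, substitute all cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def edit_score(pred, truth):
    """Segment-level edit score in [0, 100]; penalizes extra or misordered segments."""
    p, t = segments(pred), segments(truth)
    return 100.0 * (1.0 - levenshtein(p, t) / max(len(p), len(t)))


def frame_accuracy(pred, truth):
    """Percentage of frames whose predicted label matches the ground truth."""
    return 100.0 * sum(p == t for p, t in zip(pred, truth)) / len(truth)


# Hypothetical toy example: one mislabeled frame splits the first action into pieces.
truth = ["cut"] * 6 + ["mix"] * 6
over_segmented = ["cut", "cut", "mix", "cut", "cut", "cut"] + ["mix"] * 6

print(frame_accuracy(over_segmented, truth))  # 11 of 12 frames correct
print(edit_score(over_segmented, truth))      # 4 predicted segments vs. 2 true segments
```

In this toy case a single wrong frame leaves frame accuracy above 90 but halves the edit score, which is the kind of gap that a refinement step aimed at over-segmentation is meant to close.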

Original language: English
Article number: 21
Journal: Machine Vision and Applications
Volume: 35
Issue number: 2
DOIs
Publication status: Published - Mar 2024

Keywords

  • Action assimilation operator
  • Action segmentation
  • Adaptive margin-guided discriminative feature learning
  • Coefficient-adaptive loss function
  • Energy and context-driven refinement module
