Improving accuracy of temporal action detection by deep hybrid convolutional network

Ming Gang Gan, Yan Zhang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Temporal action detection, a fundamental yet challenging task in understanding human actions, is usually divided into two stages: temporal action proposal generation and proposal classification. Proposal classification is commonly treated as an ordinary action recognition task and receives little attention. However, compared with action classification, action proposals exhibit larger intra-class variations and subtler inter-class differences, which makes them harder to classify accurately. In this paper, we propose a novel end-to-end framework called Deep Hybrid Convolutional Network (DHCNet) to classify action proposals and achieve high-performance temporal action detection. DHCNet improves temporal action detection performance in three ways. First, DHCNet uses Subnet I to model the temporal structure of proposals and generate discriminative proposal features. Second, Subnet II of DHCNet exploits Graph Convolution (GConv) to gather information from other proposals, obtaining richer semantic information to enhance the proposal features. Third, DHCNet adopts a coarse-to-fine cascaded classification, in which the influence of large intra-class variations and subtle inter-class differences is reduced significantly at different granularities. In addition, we design an iterative boundary regression method based on closed-loop feedback to refine the temporal boundaries of proposals. Extensive experiments demonstrate the effectiveness of our approach. Furthermore, DHCNet achieves state-of-the-art performance on the THUMOS’14 dataset (59.9% mAP@0.5).
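For readers unfamiliar with the mAP@0.5 metric quoted above: in temporal action detection, a predicted segment is normally counted as correct when its temporal intersection-over-union (tIoU) with a ground-truth segment of the same class reaches the threshold, here 0.5. The sketch below illustrates that overlap measure only. The function names and the (start, end) representation in seconds are assumptions made for illustration and are not taken from the paper.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments, e.g. in seconds."""
    p_start, p_end = pred
    g_start, g_end = gt
    # Length of the overlapping interval; zero when the segments are disjoint.
    inter = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    # Union = sum of both lengths minus the overlap counted twice.
    union = (p_end - p_start) + (g_end - g_start) - inter
    return inter / union if union > 0 else 0.0


def meets_threshold(pred, gt, threshold=0.5):
    """Hypothetical helper: does the prediction overlap enough at this tIoU?"""
    return temporal_iou(pred, gt) >= threshold
```

A full mAP computation additionally ranks detections by confidence and matches each ground-truth segment at most once; that bookkeeping is omitted here.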

Original language: English
Pages (from-to): 16127-16149
Number of pages: 23
Journal: Multimedia Tools and Applications
Volume: 82
Issue number: 11
Publication status: Published - May 2023

Keywords

  • Boundary regression
  • Proposal classification
  • Temporal action detection
  • Temporal action location
  • Video analysis
