Improving accuracy of temporal action detection by deep hybrid convolutional network

Ming Gang Gan, Yan Zhang*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Temporal action detection, a fundamental yet challenging task in understanding human actions, is usually divided into two stages: temporal action proposal generation and proposal classification. Classifying action proposals is generally treated as an action recognition task and receives little attention. However, compared with action classification, classifying action proposals involves larger intra-class variations and subtler inter-class differences, making accurate classification more difficult. In this paper, we propose a novel end-to-end framework called Deep Hybrid Convolutional Network (DHCNet) to classify action proposals and achieve high-performance temporal action detection. DHCNet improves temporal action detection performance in three ways. First, DHCNet uses Subnet I to model the temporal structure of proposals and generate discriminative proposal features. Second, Subnet II of DHCNet exploits Graph Convolution (GConv) to gather information from other proposals, obtaining richer semantic information to enhance the proposal features. Third, DHCNet adopts a coarse-to-fine cascaded classification, in which the influence of large intra-class variations and subtle inter-class differences is reduced significantly at different granularities. In addition, we design an iterative boundary regression method based on closed-loop feedback to refine the temporal boundaries of proposals. Extensive experiments demonstrate the effectiveness of our approach. Furthermore, DHCNet achieves state-of-the-art performance on the THUMOS’14 dataset (59.9% mAP@0.5).
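The mAP@0.5 figure above is conventionally computed by matching each predicted segment to a ground-truth segment of the same class at a temporal IoU threshold of 0.5. The following is a minimal sketch of that overlap test, not code from the paper: the function names `temporal_iou` and `is_true_positive` are hypothetical, and segments are assumed to be `(start, end)` pairs in seconds.

```python
def temporal_iou(seg_a, seg_b):
    """Temporal intersection-over-union of two (start, end) segments."""
    start_a, end_a = seg_a
    start_b, end_b = seg_b
    # Overlap length is zero when the segments are disjoint.
    inter = max(0.0, min(end_a, end_b) - max(start_a, start_b))
    union = (end_a - start_a) + (end_b - start_b) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold=0.5):
    """Assumed matching rule: a prediction counts if its tIoU meets the threshold."""
    return temporal_iou(pred, gt) >= threshold
```

For example, a prediction of `(2.0, 10.0)` against a ground truth of `(0.0, 10.0)` overlaps for 8 of 10 seconds, so it would count as a match at the 0.5 threshold, whereas `(5.0, 15.0)` overlaps for only 5 of 15 seconds and would not.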

Original language: English
Pages (from-to): 16127-16149
Number of pages: 23
Journal: Multimedia Tools and Applications
Volume: 82
Issue number: 11
DOI
Publication status: Published - May 2023
