Multi-Branch Spatial-Temporal Network for Action Recognition

Yingying Wang, Wei Li*, Ran Tao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

13 Citations (Scopus)

Abstract

Human action recognition based on deep-learning methods has received increasing attention and developed rapidly. However, current methods suffer from several limitations: convolving over time and space independently confuses appearance and motion cues, only short sequences are processed, and modeling is restricted to a single temporal scale. The key to precisely classifying actions is to capture appearance and motion throughout entire videos. To this end, a multi-branch spatial-temporal network (MSTN) is proposed, consisting of a multi-branch deep network and a long-term feature (LTF) layer. The benefits of the proposed MSTN are twofold: (a) the multi-branch spatial-temporal network encodes spatial and temporal information simultaneously, and (b) the LTF layer aggregates a video-level representation over multiple temporal scales. Evaluations on two action datasets and comparisons with several state-of-the-art approaches demonstrate the effectiveness of the proposed network.
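The abstract describes the architecture only at a high level. As a rough illustration of the two ideas it names, parallel spatial and temporal branches plus multi-scale temporal aggregation, the PyTorch sketch below is a toy approximation and not the authors' MSTN: the class name MSTNSketch, the branch definitions, and the ltf_pool helper are hypothetical stand-ins, and the actual network and LTF layer in the paper differ.

```python
# Illustrative sketch only -- not the paper's released code. Assumes video
# input shaped (batch, channels, time, height, width) and uses 3D convolutions
# to stand in for the spatial and temporal branches described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MSTNSketch(nn.Module):
    """Toy two-branch network with multi-scale temporal pooling."""

    def __init__(self, num_classes: int = 101, width: int = 32,
                 scales=(1, 2, 4)):
        super().__init__()
        # Spatial branch: convolves within each frame (kernel size 1 on the time axis).
        self.spatial = nn.Sequential(
            nn.Conv3d(3, width, kernel_size=(1, 3, 3), padding=(0, 1, 1)),
            nn.BatchNorm3d(width), nn.ReLU(inplace=True),
        )
        # Temporal branch: convolves across frames (kernel size 3 on the time axis).
        self.temporal = nn.Sequential(
            nn.Conv3d(3, width, kernel_size=(3, 1, 1), padding=(1, 0, 0)),
            nn.BatchNorm3d(width), nn.ReLU(inplace=True),
        )
        self.scales = scales
        # Classifier over features from both branches, pooled at every scale.
        self.fc = nn.Linear(2 * width * sum(scales), num_classes)

    def ltf_pool(self, x: torch.Tensor) -> torch.Tensor:
        """Hypothetical long-term-feature pooling: average spatially and keep
        s temporal segments per scale, then concatenate across scales."""
        feats = []
        for s in self.scales:
            pooled = F.adaptive_avg_pool3d(x, (s, 1, 1))  # (B, C, s, 1, 1)
            feats.append(pooled.flatten(1))               # (B, C * s)
        return torch.cat(feats, dim=1)                    # (B, C * sum(scales))

    def forward(self, video: torch.Tensor) -> torch.Tensor:
        # video: (B, 3, T, H, W); branch outputs are concatenated on channels.
        fused = torch.cat([self.spatial(video), self.temporal(video)], dim=1)
        return self.fc(self.ltf_pool(fused))


if __name__ == "__main__":
    clip = torch.randn(2, 3, 16, 112, 112)  # two clips of 16 RGB frames
    logits = MSTNSketch()(clip)
    print(logits.shape)                     # torch.Size([2, 101])
```

The point of the sketch is the shape of the computation: each branch sees the whole clip, and the multi-scale pooling step yields a fixed-length, video-level feature regardless of clip length.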

Original language: English
Article number: 8832232
Pages (from-to): 1556-1560
Number of pages: 5
Journal: IEEE Signal Processing Letters
Volume: 26
Issue: 10
DOI
Publication status: Published - Oct 2019
