3D convolutional two-stream network for action recognition in videos

Min Li, Yuezhu Qi, Jian Yang, Yanfang Zhang, Junxing Ren, Hong Du

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

4 Citations (Scopus)

Abstract

In recent years, action recognition based on two-stream networks has developed rapidly. However, most existing methods describe video content incompletely and with distortion, because they crop and warp frames or extract features at the clip level. This paper proposes a deep learning approach that preserves the complete contextual relation of temporal human actions in videos. The proposed architecture follows the two-stream design, adding a novel 3D Convolutional Network (ConvNet) and a pyramid pooling layer to form an end-to-end behavioral feature learning method. The 3D ConvNets extract video-level spatio-temporal features from two input streams: the RGB images and the corresponding optical flow. The multi-scale pyramid pooling layer maps the generated feature maps to a uniform size regardless of the input video size. The final predictions result from fusing the softmax scores of the two streams, subject to the weighting factor of each stream. Our experimental results suggest weighting the spatial stream slightly higher than the temporal stream, and the performance of the trained model is conditionally optimized. The proposed method is evaluated on two challenging video action datasets, UCF101 and HMDB51, and achieves state-of-the-art performance, above 96.1% on the UCF101 dataset.
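The two mechanisms named in the abstract, pooling to a fixed size and weighted fusion of softmax scores, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the pyramid levels (1, 2, 4), and the spatial weight of 0.6 are assumptions chosen for illustration, and the pooling is shown on a 1D sequence rather than on 3D feature maps.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(rgb_logits, flow_logits, w_spatial=0.6):
    """Weighted average of the per-class softmax scores of the two streams.

    w_spatial is a hypothetical weight, not a value taken from the paper.
    """
    p_rgb = softmax(rgb_logits)
    p_flow = softmax(flow_logits)
    w_temporal = 1.0 - w_spatial
    return [w_spatial * a + w_temporal * b for a, b in zip(p_rgb, p_flow)]


def pyramid_max_pool(values, levels=(1, 2, 4)):
    """Max-pool a non-empty 1D sequence into sum(levels) bins.

    The output length depends only on `levels`, never on len(values).
    """
    n = len(values)
    pooled = []
    for bins in levels:
        for i in range(bins):
            start = (i * n) // bins
            # Ceil of the bin's right edge; each bin covers at least one element.
            end = max(start + 1, ((i + 1) * n + bins - 1) // bins)
            pooled.append(max(values[start:end]))
    return pooled


# Sequences of different lengths produce outputs of the same length (7 here).
short_clip = pyramid_max_pool(list(range(10)))
long_clip = pyramid_max_pool(list(range(37)))

# Fused class scores remain a probability distribution.
fused = fuse_scores([2.0, 1.0, 0.1], [0.5, 2.5, 0.3])
```

In this sketch the two stream weights sum to 1, so the fused scores still sum to 1. The weighting scheme actually used in the paper may differ.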

Original language: English
Title of host publication: Proceedings - IEEE 31st International Conference on Tools with Artificial Intelligence, ICTAI 2019
Publisher: IEEE Computer Society
Pages: 1697-1701
Number of pages: 5
ISBN (electronic): 9781728137988
Publication status: Published - Nov 2019
Event: 31st IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2019 - Portland, United States
Duration: 4 Nov 2019 to 6 Nov 2019

Publication series

Name: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
Volume: 2019-November
ISSN (print): 1082-3409

Conference

Conference: 31st IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2019
Country/Territory: United States
City: Portland
Period: 4/11/19 to 6/11/19
