Multimedia event detection via deep spatial-temporal neural networks

Jingyi Hou, Xinxiao Wu, Feiwu Yu, Yunde Jia

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Citations (Scopus)

Abstract

This paper proposes a novel method for multimedia event detection that uses deep spatial-temporal neural networks built on a deep Convolutional Neural Network (CNN). To take full advantage of the motion and appearance information of events in videos, our networks contain two branches: a temporal neural network and a spatial neural network. The temporal neural network captures motion information with Recurrent Neural Networks that use a variant of the gated recurrent unit. The spatial neural network captures object information with the deep CNN, encoding the CNN features as a bag of semantics to obtain more discriminative representations. Both the temporal and spatial features are beneficial for event detection in a fully coupled way. Finally, we employ the generalized multiple kernel learning method to effectively fuse these two types of heterogeneous and complementary features for action recognition. Experiments on the TRECVID MEDTest 14 dataset show that our method achieves better performance than the state of the art.
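The abstract gives no implementation details, so the following is only a minimal sketch of the two-branch idea under stated assumptions, not the authors' method. It uses a standard GRU (not the paper's modified unit) as the temporal branch, mean pooling of frame features as a stand-in for the bag-of-semantics encoding, and fixed kernel weights as a stand-in for weights learned by generalized multiple kernel learning. All function names, dimensions, and random inputs are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_encode(frames, params):
    """Run a standard GRU over per-frame features and return the last hidden state."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    h = np.zeros(Uz.shape[0])
    for x in frames:
        z = sigmoid(Wz @ x + Uz @ h)              # update gate
        r = sigmoid(Wr @ x + Ur @ h)              # reset gate
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
        h = (1.0 - z) * h + z * h_tilde           # one common GRU convention
    return h

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fuse_kernels(kernels, weights):
    """Convex combination of kernel matrices (weights are fixed here, not learned)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

# Hypothetical toy data: 4 videos, 6 frames each, 8-dim frame features.
rng = np.random.default_rng(0)
n_videos, n_frames, feat_dim, hidden_dim = 4, 6, 8, 5
params = tuple(
    rng.normal(scale=0.1, size=(hidden_dim, d))
    for d in (feat_dim, hidden_dim) * 3
)
videos = rng.normal(size=(n_videos, n_frames, feat_dim))

# Temporal branch: one GRU summary vector per video.
temporal = np.stack([gru_encode(v, params) for v in videos])
# Spatial branch: mean-pooled frame features (stand-in for bag of semantics).
spatial = videos.mean(axis=1)

# One kernel per feature type, then a weighted fusion.
K_temporal = rbf_kernel(temporal, temporal, gamma=1.0)
K_spatial = rbf_kernel(spatial, spatial, gamma=0.1)
K_fused = fuse_kernels([K_temporal, K_spatial], weights=[0.5, 0.5])
```

The fused matrix could be passed to any kernel classifier that accepts a precomputed kernel. In a multiple kernel learning setting the weights would be optimized jointly with the classifier rather than fixed as they are here.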

Original language: English
Title of host publication: 2016 IEEE International Conference on Multimedia and Expo, ICME 2016
Publisher: IEEE Computer Society
ISBN (Electronic): 9781467372589
DOI
Publication status: Published - 25 Aug 2016
Event: 2016 IEEE International Conference on Multimedia and Expo, ICME 2016 - Seattle, United States
Duration: 11 Jul 2016 → 15 Jul 2016

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
Volume: 2016-August
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2016 IEEE International Conference on Multimedia and Expo, ICME 2016
Country/Territory: United States
City: Seattle
Period: 11/07/16 → 15/07/16
