Multimedia event detection via deep spatial-temporal neural networks

Jingyi Hou, Xinxiao Wu, Feiwu Yu, Yunde Jia

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Citations (Scopus)

Abstract

This paper proposes a novel method for multimedia event detection using deep spatial-temporal neural networks built on a deep Convolutional Neural Network (CNN). To take full advantage of the motion and appearance information of events in videos, our networks contain two branches: a temporal neural network and a spatial neural network. The temporal neural network captures motion information using Recurrent Neural Networks with a mutation of the gated recurrent unit. The spatial neural network captures object information using the deep CNN, encoding the CNN features as a bag of semantics to obtain more discriminative representations. Both the temporal and spatial features are beneficial for event detection in a fully coupled way. Finally, we employ the generalized multiple kernel learning method to effectively fuse these two types of heterogeneous and complementary features for action recognition. Experiments on the TRECVID MEDTest 14 dataset show that our method achieves better performance than the state of the art.
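The two-branch idea described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions, because the abstract does not give the details: a standard gated recurrent unit stands in for the paper's GRU mutation, random vectors stand in for the bag-of-semantics spatial features, and a fixed-weight sum of two RBF kernels stands in for generalized multiple kernel learning, which would learn the combination. All function and parameter names (`gru_encode`, `make_gru_params`, `fuse_kernels`, `demo`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def make_gru_params(d_in, d_h, rng):
    """Random, untrained weights for a standard GRU (illustration only)."""
    shapes = {"Wz": (d_h, d_in), "Uz": (d_h, d_h),
              "Wr": (d_h, d_in), "Ur": (d_h, d_h),
              "Wh": (d_h, d_in), "Uh": (d_h, d_h)}
    return {name: 0.1 * rng.standard_normal(shape) for name, shape in shapes.items()}

def gru_encode(frames, params):
    """Temporal branch stand-in: run a standard GRU over per-frame
    features of shape (T, d_in) and return the final hidden state."""
    h = np.zeros(params["Uz"].shape[0])
    for x in frames:
        z = sigmoid(params["Wz"] @ x + params["Uz"] @ h)              # update gate
        r = sigmoid(params["Wr"] @ x + params["Ur"] @ h)              # reset gate
        h_new = np.tanh(params["Wh"] @ x + params["Uh"] @ (r * h))    # candidate state
        h = (1.0 - z) * h + z * h_new
    return h

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fuse_kernels(K_temporal, K_spatial, w):
    """Fixed-weight convex combination of two kernels (assumed stand-in
    for generalized MKL, which would learn the combination instead)."""
    return w * K_temporal + (1.0 - w) * K_spatial

def demo(n_videos=4, n_frames=5, d_in=8, d_h=6, d_spatial=10, seed=0):
    rng = np.random.default_rng(seed)
    params = make_gru_params(d_in, d_h, rng)
    # Temporal features: one GRU encoding per video.
    temporal = np.stack([gru_encode(rng.standard_normal((n_frames, d_in)), params)
                         for _ in range(n_videos)])
    # Spatial features: random placeholders for bag-of-semantics vectors.
    spatial = rng.standard_normal((n_videos, d_spatial))
    K_t = rbf_kernel(temporal, temporal, gamma=1.0)
    K_s = rbf_kernel(spatial, spatial, gamma=0.1)
    return fuse_kernels(K_t, K_s, w=0.5)
```

Calling `demo()` returns a symmetric video-by-video kernel matrix. A matrix of this kind could be passed to any classifier that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.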

Original language: English
Title of host publication: 2016 IEEE International Conference on Multimedia and Expo, ICME 2016
Publisher: IEEE Computer Society
ISBN (Electronic): 9781467372589
DOIs
Publication status: Published - 25 Aug 2016
Event: 2016 IEEE International Conference on Multimedia and Expo, ICME 2016 - Seattle, United States
Duration: 11 Jul 2016 to 15 Jul 2016

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
Volume: 2016-August
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2016 IEEE International Conference on Multimedia and Expo, ICME 2016
Country/Territory: United States
City: Seattle
Period: 11/07/16 to 15/07/16

Keywords

  • multimedia event detection
  • recurrent neural networks
  • spatial-temporal networks
