Deep neural network based unsupervised video representation

Xinxiao Wu, Kun Wu

Research output: Contribution to journal › Review article › peer-review

Abstract

Most video representation methods in computer vision are supervised, requiring large labeled training video sets that are expensive to scale up to rapidly growing data. To address this problem, this paper proposes an unsupervised video representation method based on a deep convolutional neural network. Improved dense trajectories (iDT) are used to extract video blocks, which are then used to alternately train the convolutional neural network and the clusters. The deep convolutional neural network model is trained with this iterative algorithm to obtain unsupervised video representations. The proposed model is applied to extract features from the HMDB51 and CCV datasets for the tasks of action recognition and event detection, respectively. In the experiments, a mean accuracy of 62.6% and a mean average precision (mAP) of 43.6% are obtained, respectively, which demonstrates the effectiveness of the proposed method.
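The abstract describes an alternating scheme: cluster the CNN's features of iDT-extracted video blocks to obtain pseudo-labels, then train the CNN on those pseudo-labels, and repeat. Below is a minimal sketch of that general idea, not the authors' code: the network architecture, the cluster count, the training schedule, and the random tensors standing in for iDT-extracted video blocks are all illustrative assumptions (PyTorch and scikit-learn are assumed to be available).

```python
# Sketch of alternating clustering and CNN training on video blocks.
# All hyperparameters and the toy network are assumptions, not the paper's setup.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

NUM_CLUSTERS = 64          # assumed number of clusters (pseudo-classes)
BLOCK_SHAPE = (3, 32, 32)  # assumed size of an extracted video block

class BlockCNN(nn.Module):
    """Small convolutional encoder plus a linear classifier over pseudo-labels."""
    def __init__(self, num_clusters: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.classifier = nn.Linear(64, num_clusters)

    def forward(self, x):
        return self.classifier(self.features(x))

def alternate_train(blocks: torch.Tensor, rounds: int = 5, epochs: int = 3):
    """Alternate k-means clustering of CNN features with supervised CNN updates."""
    model = BlockCNN(NUM_CLUSTERS)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    for _ in range(rounds):
        # Step 1: cluster current CNN features to obtain pseudo-labels.
        model.eval()
        with torch.no_grad():
            feats = model.features(blocks).numpy()
        pseudo_labels = torch.as_tensor(
            KMeans(n_clusters=NUM_CLUSTERS, n_init=10).fit_predict(feats)
        )
        # Step 2: train the CNN to predict the pseudo-labels.
        model.train()
        for _ in range(epochs):
            optimizer.zero_grad()
            loss = criterion(model(blocks), pseudo_labels)
            loss.backward()
            optimizer.step()
    # model.features(...) then serves as the unsupervised video representation.
    return model

if __name__ == "__main__":
    # Random tensors stand in for video blocks extracted with improved dense trajectories.
    fake_blocks = torch.randn(256, *BLOCK_SHAPE)
    alternate_train(fake_blocks)
```

In this sketch the learned features after the final round would be fed to a downstream classifier for tasks such as action recognition or event detection.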

Original language: English
Pages (from-to): 8-12
Number of pages: 5
Journal: Beijing Jiaotong Daxue Xuebao/Journal of Beijing Jiaotong University
Volume: 41
Issue number: 6
Publication status: Published - 1 Dec 2017

Keywords

  • Convolutional neural networks
  • Unsupervised learning
  • Video representation
