Transfer Latent SVM for joint recognition and localization of actions in videos

Cuiwei Liu, Xinxiao Wu*, Yunde Jia

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

13 Citations (Scopus)

Abstract

In this paper, we develop a novel transfer latent support vector machine for the joint recognition and localization of actions in videos, using Web images and weakly annotated training videos. The model takes as input training videos annotated only with action labels, alleviating the laborious and time-consuming manual annotation of action locations. Since the ground truth of action locations in videos is not available, the locations are modeled as latent variables in our method and are inferred during both the training and testing phases. To improve localization accuracy with prior information about action locations, we collect a number of Web images annotated with both action labels and action locations, and learn a discriminative model by enforcing local similarities between videos and Web images. A structural transformation based on randomized clustering forests is used to map the Web images to videos, handling the heterogeneous features of Web images and videos. Experiments on two public action datasets demonstrate the effectiveness of the proposed model for both action localization and action recognition.
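The abstract leaves the model's objective implicit. For orientation, the generic latent SVM formulation it builds on scores a video x by maximizing over candidate locations h; below, a schematic term R(w; I) stands in for the paper's Web-image similarity constraint, whose exact form is not given here:

```latex
% Generic latent SVM (schematic; R(w; I) is a placeholder for the
% paper's Web-image transfer term, not its actual formulation).
f_{\mathbf{w}}(x) = \max_{h \in \mathcal{H}(x)} \mathbf{w}^{\top} \Phi(x, h)

\min_{\mathbf{w}} \ \frac{1}{2} \lVert \mathbf{w} \rVert^{2}
  + C \sum_{i=1}^{N} \max\bigl(0,\; 1 - y_{i} f_{\mathbf{w}}(x_{i})\bigr)
  + \lambda \, R(\mathbf{w}; \mathcal{I})
```

Here Φ(x, h) is the feature of video x at candidate location h, y_i is the action label, and 𝓘 denotes the set of location-annotated Web images.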
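Latent SVMs of this kind are typically trained by alternating between inferring the latent variables and updating the weights. The sketch below illustrates that standard loop in the binary case; `phi` (a feature extractor over a video and a location) and `candidates` (a location-hypothesis generator) are hypothetical placeholders, and the paper's actual optimization, which also incorporates the Web-image term, is not reproduced.

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_latent_svm(videos, labels, phi, candidates, n_rounds=5, C=1.0):
    """Standard alternating training loop for a binary latent SVM.

    phi(video, loc) -> feature vector; candidates(video) -> list of
    location hypotheses. Both are placeholders for whatever
    representation the actual system uses. labels are in {-1, +1}.
    """
    # Initialize each video's latent location with its first candidate.
    locs = [candidates(v)[0] for v in videos]
    clf = None
    for _ in range(n_rounds):
        # Step 1: locations fixed -> ordinary linear SVM on located features.
        X = np.stack([phi(v, h) for v, h in zip(videos, locs)])
        clf = LinearSVC(C=C).fit(X, labels)
        w, b = clf.coef_.ravel(), clf.intercept_[0]
        # Step 2: weights fixed -> re-infer the highest-scoring location
        # for each positive video (a common simplification keeps the
        # negatives' initial locations fixed).
        for i, v in enumerate(videos):
            if labels[i] == 1:
                locs[i] = max(candidates(v), key=lambda h: w @ phi(v, h) + b)
    return clf, locs
```

At test time, the same argmax over candidate locations yields both the predicted label (via the sign of the score) and the localization, which matches the abstract's statement that locations are inferred in both phases.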
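The structural transformation is built on randomized clustering forests, which quantize a feature vector by the leaves it reaches across an ensemble of randomized trees. scikit-learn's RandomTreesEmbedding implements this leaf coding; the sketch below uses it only to illustrate the general idea, assuming (purely for the illustration) that image and video descriptors share a dimensionality, which the paper's actual transformation does not require.

```python
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding

rng = np.random.default_rng(0)
video_feats = rng.normal(size=(200, 64))   # stand-in video descriptors
image_feats = rng.normal(size=(500, 64))   # stand-in Web-image descriptors

# Fit a randomized clustering forest on one domain, then describe
# samples from both domains by the leaves they fall into. Samples
# routed to the same leaves receive similar codes, giving a shared
# space in which cross-domain similarities can be enforced.
forest = RandomTreesEmbedding(n_estimators=50, max_depth=5, random_state=0)
forest.fit(video_feats)
video_codes = forest.transform(video_feats)  # sparse one-hot leaf codes
image_codes = forest.transform(image_feats)
print(video_codes.shape, image_codes.shape)  # same coding dimensionality
```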

Original language: English
Article number: 7299283
Pages (from-to): 2596-2608
Number of pages: 13
Journal: IEEE Transactions on Cybernetics
Volume: 46
Issue number: 10
DOIs
Publication status: Published - 15 Oct 2015

Keywords

  • Action localization
  • action recognition
  • transfer latent support vector machine (TLSVM) model
