Weakly Supervised Action Recognition and Localization Using Web Images

Cuiwei Liu*, Xinxiao Wu, Yunde Jia

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper addresses the problem of joint recognition and localization of actions in videos. We develop a novel Transfer Latent Support Vector Machine (TLSVM) by using Web images and weakly annotated training videos. To avoid laborious and time-consuming manual annotation of action locations, the model takes as input training videos that are annotated only with action labels. Because ground-truth action locations are not available for the videos, the locations are treated as latent variables in our method and are inferred during both the training and testing phases. To improve localization accuracy with prior information about action locations, we collect a number of Web images annotated with both action labels and action locations, and learn a discriminative model by enforcing local similarities between videos and Web images. A structural transformation based on a randomized clustering forest is used to map Web images to videos in order to handle the heterogeneous features of Web images and videos. Experiments on two publicly available action datasets demonstrate that the proposed model is effective for both action localization and action recognition.
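
The idea of treating action locations as latent variables can be illustrated with a minimal sketch of the inference step in a generic latent SVM. This is not the paper's TLSVM: the linear scoring function, the candidate boxes, the feature vectors, and the names `dot` and `infer` are all assumptions made for illustration. The sketch only shows how a label and a location can be chosen jointly by maximizing a score over every (label, location) pair.

```python
def dot(w, phi):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(w, phi))


def infer(weights, candidates):
    """Jointly pick the action label and latent location with the highest score.

    weights:    dict mapping label -> weight vector (hypothetical, assumed learned)
    candidates: dict mapping location (x, y, w, h) -> feature vector (hypothetical)
    Returns a (label, location, score) tuple.
    """
    best = None
    for label, w in weights.items():
        for location, phi in candidates.items():
            score = dot(w, phi)
            if best is None or score > best[2]:
                best = (label, location, score)
    return best


# Toy data: two action classes and two candidate boxes (all values are made up).
weights = {"run": [1.0, 0.0], "wave": [0.0, 1.0]}
candidates = {
    (0, 0, 32, 32): [0.2, 0.9],
    (16, 16, 32, 32): [0.5, 0.1],
}

label, location, score = infer(weights, candidates)
print(label, location, score)
```

In a latent SVM, the same maximization over locations is typically also run during training, alternating with weight updates, which matches the abstract's statement that locations are inferred in both phases.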

Original language: English
Title of host publication: Computer Vision - ACCV 2014 - 12th Asian Conference on Computer Vision, Revised Selected Papers
Editors: Daniel Cremers, Hideo Saito, Ian Reid, Ming-Hsuan Yang
Publisher: Springer Verlag
Pages: 642-657
Number of pages: 16
ISBN (Electronic): 9783319168135
Publication status: Published - 2015
Event: 12th Asian Conference on Computer Vision, ACCV 2014 - Singapore, Singapore
Duration: 1 Nov 2014 → 5 Nov 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9007
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th Asian Conference on Computer Vision, ACCV 2014
Country/Territory: Singapore
City: Singapore
Period: 1/11/14 → 5/11/14
