Cross-domain structural model for video event annotation via web images

Han Wang, Xiabi Liu*, Xinxiao Wu, Yunde Jia

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Annotating events in uncontrolled videos is a challenging task. Most previous work focuses on obtaining concepts from numerous labeled videos, but it is extremely time-consuming and labor-intensive to collect the large number of labeled videos required to model events under various circumstances. In this paper, we learn models for video event annotation by leveraging abundant Web images, which are a rich source of information covering many events captured under various conditions and are roughly annotated as well. Our method is based on a new discriminative structural model, called the Cross-Domain Structural Model (CDSM), which transfers knowledge from Web images (source domain) to consumer videos (target domain) by jointly modeling the interaction between videos and images. Specifically, under this framework we build a common feature subspace to deal with the feature distribution mismatch between the video domain and the image domain. Further, we propose to describe events with weak semantic attributes, which can be obtained with little or no labor. Experimental results on challenging video datasets demonstrate the effectiveness of our transfer learning method.
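To make the common-subspace idea concrete, the sketch below illustrates the general pattern of this kind of transfer: project labeled source-domain (image) features and unlabeled target-domain (video) features into a shared low-dimensional subspace, train a classifier on the projected source data, and score the target data. This is a minimal illustration only, not the authors' CDSM; the synthetic data, the PCA-based projection standing in for the jointly learned subspace, and the least-squares classifier standing in for the structural model are all assumptions.

```python
import numpy as np

# Toy illustration (not the authors' CDSM): align image (source) and video
# (target) features in a shared subspace, train on labeled source data,
# and annotate the target videos. All shapes and data are synthetic.

rng = np.random.default_rng(0)

# Synthetic features: 200 labeled Web images, 50 unlabeled videos, 100-D,
# with a deliberate shift between the two domains.
X_img = rng.normal(size=(200, 100)) + 0.5          # source domain
y_img = (X_img[:, 0] > 0.5).astype(int)            # rough event labels
X_vid = rng.normal(size=(50, 100)) - 0.5           # target domain (shifted)

def zscore(X):
    """Per-domain standardization reduces first-order distribution mismatch."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

X_img_n, X_vid_n = zscore(X_img), zscore(X_vid)

# Shared subspace: top principal directions of the pooled, standardized data.
# (CDSM learns the subspace jointly with the model; PCA is a simple stand-in.)
pooled = np.vstack([X_img_n, X_vid_n])
_, _, Vt = np.linalg.svd(pooled - pooled.mean(axis=0), full_matrices=False)
W = Vt[:20].T                                      # 100-D -> 20-D projection

Z_img, Z_vid = X_img_n @ W, X_vid_n @ W

# Least-squares linear classifier trained on projected source features.
w = np.linalg.lstsq(Z_img, 2.0 * y_img - 1.0, rcond=None)[0]
video_event_scores = Z_vid @ w                     # score the target videos
print(video_event_scores[:5])
```

The design point the sketch captures is that the classifier never sees raw video features: both domains pass through the same projection, so a model fit on cheaply labeled Web images can be applied to videos despite the distribution shift.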

Original language: English
Pages (from-to): 10439-10456
Number of pages: 18
Journal: Multimedia Tools and Applications
Volume: 74
Issue number: 23
Publication status: Published - Dec 2015

Keywords

  • Knowledge transfer
  • Video analysis
  • Video annotation
