Cross-domain structural model for video event annotation via web images

Han Wang, Xiabi Liu*, Xinxiao Wu, Yunde Jia

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Annotating events in uncontrolled videos is a challenging task. Most previous work focuses on learning event concepts from large numbers of labeled videos, but collecting enough labeled videos to model events under varied circumstances is extremely time-consuming and labor-intensive. In this paper, we learn models for video event annotation by leveraging abundant Web images, which provide a rich source of information: they depict many events captured under diverse conditions and are already roughly annotated. Our method is based on a new discriminative structural model, called the Cross-Domain Structural Model (CDSM), which transfers knowledge from Web images (the source domain) to consumer videos (the target domain) by jointly modeling the interaction between videos and images. Specifically, under this framework we build a common feature subspace to handle the mismatch between the feature distributions of the video and image domains. Furthermore, we propose describing events with weak semantic attributes, which can be obtained with little or no manual labor. Experimental results on challenging video datasets demonstrate the effectiveness of our transfer learning method.
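The common-feature-subspace idea from the abstract can be illustrated with a minimal sketch. This is not the CDSM formulation itself, only a crude stand-in: it pools image (source) and video (target) features, centers them, and uses PCA to obtain one shared low-dimensional subspace into which both domains are projected. The function name, dimensions, and the choice of PCA are all illustrative assumptions, not details from the paper.

```python
import numpy as np

def common_subspace(X_img, X_vid, k=5):
    """Project image (source) and video (target) features into a shared
    k-dimensional subspace via PCA on the pooled, centered data.
    Illustrative stand-in for a learned cross-domain subspace."""
    X = np.vstack([X_img, X_vid])
    mu = X.mean(axis=0)
    # top-k principal directions of the pooled features
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    P = Vt[:k].T                          # d x k projection matrix
    return (X_img - mu) @ P, (X_vid - mu) @ P

rng = np.random.default_rng(0)
X_img = rng.normal(size=(50, 20))         # Web-image features (source domain)
X_vid = rng.normal(size=(30, 20)) + 1.0   # consumer-video features (target domain)
Z_img, Z_vid = common_subspace(X_img, X_vid, k=5)
print(Z_img.shape, Z_vid.shape)           # (50, 5) (30, 5)
```

In the paper's setting, a classifier trained on the projected image features would then be transferred to annotate events in the projected video features; here PCA merely stands in for the jointly learned subspace.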

Original language: English
Pages (from-to): 10439-10456
Number of pages: 18
Journal: Multimedia Tools and Applications
Volume: 74
Issue number: 23
DOI
Publication status: Published - Dec 2015
