Recognizing key segments of videos for video annotation by learning from web image sets

Hao Song, Xinxiao Wu*, Wei Liang, Yunde Jia

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

In this paper, we propose an approach that infers the labels of unlabeled consumer videos and simultaneously recognizes the key segments of those videos by learning from Web image sets for video annotation. The key segments are recognized automatically by transferring knowledge learned from related Web image sets to the videos. We introduce an adaptive latent structural SVM method that adapts the classifiers pre-learned from Web image sets into an optimal target classifier, where the locations of the key segments are modeled as latent variables because the ground truth of the key segments is not available. We utilize a limited number of labeled videos and abundant labeled Web images to train the annotation models, which significantly alleviates the time-consuming and labor-expensive collection of large numbers of labeled training videos. Experiments on two challenging datasets, Columbia Consumer Video (CCV) and TRECVID 2014 Multimedia Event Detection (MED2014), show that our method performs better than state-of-the-art methods.
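To make the latent-variable idea in the abstract concrete, the sketch below shows one way the key-segment location can be treated as a latent variable at prediction time: a linear (SVM-style) classifier scores every candidate temporal window, and the highest-scoring window is taken as the key segment. The function names, the fixed window length, and the mean-pooled frame features are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed, not the paper's code): latent inference over
# key-segment locations under a linear classifier.
import numpy as np

def infer_key_segment(frame_features: np.ndarray, w: np.ndarray,
                      segment_len: int = 30):
    """Return (best_score, (start, end)) over all windows of `segment_len` frames.

    frame_features: (num_frames, dim) array of per-frame descriptors.
    w: (dim,) weight vector of the adapted linear classifier.
    """
    num_frames = frame_features.shape[0]
    best_score = -np.inf
    best_span = (0, min(segment_len, num_frames))
    for start in range(max(1, num_frames - segment_len + 1)):
        end = min(start + segment_len, num_frames)
        # Score a candidate segment by its mean-pooled frame feature.
        seg_feat = frame_features[start:end].mean(axis=0)
        score = float(w @ seg_feat)
        if score > best_score:
            best_score, best_span = score, (start, end)
    return best_score, best_span

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.normal(size=(200, 128))            # 200 frames, 128-D features
    classifiers = {c: rng.normal(size=128) for c in ["birthday", "parade"]}
    # The predicted label is the class whose classifier scores highest at its
    # own best (latent) segment; that segment is the recognized key segment.
    scored = {c: infer_key_segment(video, w) for c, w in classifiers.items()}
    label = max(scored, key=lambda c: scored[c][0])
    print(label, scored[label][1])
```

During training, this same search would supply the latent segment for each video before the classifier weights are updated, which is the usual alternating scheme for latent structural SVMs.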

Original language: English
Pages (from-to): 6111-6126
Number of pages: 16
Journal: Multimedia Tools and Applications
Volume: 76
Issue number: 5
DOI
Publication status: Published - 1 Mar 2017
