TY - GEN
T1 - Video annotation by incremental learning from grouped heterogeneous sources
AU - Wang, Han
AU - Song, Hao
AU - Wu, Xinxiao
AU - Jia, Yunde
N1 - Publisher Copyright:
© Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
N2 - Transfer learning has shown promising results in leveraging loosely labeled Web images (source domain) to learn a robust classifier for unlabeled consumer videos (target domain). Existing transfer learning methods typically apply source domain data to learn a fixed model that predicts target domain data once and for all, ignoring the rapidly updating Web data and the continuously changing requirements of users. We propose an incremental transfer learning framework in which heterogeneous knowledge is integrated and incrementally added to update the target classifier during the learning process. Under this framework, images queried from a Web image search engine (image source domain) and videos from existing action datasets (video source domain) are adopted to provide static information and motion information about the target videos, respectively. For the image source domain, images are partitioned into several groups according to their semantic information; videos in the video source domain are divided in the same way. Unlike traditional methods, which measure the relevance between a source group and the target domain videos as a whole, the group weights in this paper are treated as latent variables for each target domain video and are learned automatically according to the difference in probability distribution between the individual source group and the target domain videos. Experimental results on two challenging video datasets (CCV and Kodak) demonstrate the effectiveness of the proposed method.
AB - Transfer learning has shown promising results in leveraging loosely labeled Web images (source domain) to learn a robust classifier for unlabeled consumer videos (target domain). Existing transfer learning methods typically apply source domain data to learn a fixed model that predicts target domain data once and for all, ignoring the rapidly updating Web data and the continuously changing requirements of users. We propose an incremental transfer learning framework in which heterogeneous knowledge is integrated and incrementally added to update the target classifier during the learning process. Under this framework, images queried from a Web image search engine (image source domain) and videos from existing action datasets (video source domain) are adopted to provide static information and motion information about the target videos, respectively. For the image source domain, images are partitioned into several groups according to their semantic information; videos in the video source domain are divided in the same way. Unlike traditional methods, which measure the relevance between a source group and the target domain videos as a whole, the group weights in this paper are treated as latent variables for each target domain video and are learned automatically according to the difference in probability distribution between the individual source group and the target domain videos. Experimental results on two challenging video datasets (CCV and Kodak) demonstrate the effectiveness of the proposed method.
UR - http://www.scopus.com/inward/record.url?scp=84929613078&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-16814-2_32
DO - 10.1007/978-3-319-16814-2_32
M3 - Conference contribution
AN - SCOPUS:84929613078
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 493
EP - 507
BT - Computer Vision - ACCV 2014 - 12th Asian Conference on Computer Vision, Revised Selected Papers
A2 - Cremers, Daniel
A2 - Saito, Hideo
A2 - Reid, Ian
A2 - Yang, Ming-Hsuan
PB - Springer Verlag
T2 - 12th Asian Conference on Computer Vision, ACCV 2014
Y2 - 1 November 2014 through 5 November 2014
ER -