TY - GEN
T1 - Exploiting images for video recognition with hierarchical generative adversarial networks
AU - Yu, Feiwu
AU - Wu, Xinxiao
AU - Sun, Yuchao
AU - Duan, Lixin
N1 - Publisher Copyright:
© 2018 International Joint Conferences on Artificial Intelligence. All right reserved.
PY - 2018
Y1 - 2018
N2 - Existing deep learning methods of video recognition usually require a large number of labeled videos for training. But for a new task, videos are often unlabeled and it is also time-consuming and labor-intensive to annotate them. Instead of human annotation, we try to make use of existing fully labeled images to help recognize those videos. However, due to the problem of domain shifts and heterogeneous feature representations, the performance of classifiers trained on images may be dramatically degraded for video recognition tasks. In this paper, we propose a novel method, called Hierarchical Generative Adversarial Networks (HiGAN), to enhance recognition in videos (i.e., target domain) by transferring knowledge from images (i.e., source domain). The HiGAN model consists of a low-level conditional GAN and a high-level conditional GAN. By taking advantage of these two-level adversarial learning, our method is capable of learning a domain-invariant feature representation of source images and target videos. Comprehensive experiments on two challenging video recognition datasets (i.e. UCF101 and HMDB51) demonstrate the effectiveness of the proposed method when compared with the existing state-of-the-art domain adaptation methods.
AB - Existing deep learning methods of video recognition usually require a large number of labeled videos for training. But for a new task, videos are often unlabeled and it is also time-consuming and labor-intensive to annotate them. Instead of human annotation, we try to make use of existing fully labeled images to help recognize those videos. However, due to the problem of domain shifts and heterogeneous feature representations, the performance of classifiers trained on images may be dramatically degraded for video recognition tasks. In this paper, we propose a novel method, called Hierarchical Generative Adversarial Networks (HiGAN), to enhance recognition in videos (i.e., target domain) by transferring knowledge from images (i.e., source domain). The HiGAN model consists of a low-level conditional GAN and a high-level conditional GAN. By taking advantage of these two-level adversarial learning, our method is capable of learning a domain-invariant feature representation of source images and target videos. Comprehensive experiments on two challenging video recognition datasets (i.e. UCF101 and HMDB51) demonstrate the effectiveness of the proposed method when compared with the existing state-of-the-art domain adaptation methods.
UR - http://www.scopus.com/inward/record.url?scp=85055701629&partnerID=8YFLogxK
U2 - 10.24963/ijcai.2018/154
DO - 10.24963/ijcai.2018/154
M3 - Conference contribution
AN - SCOPUS:85055701629
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 1107
EP - 1113
BT - Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
A2 - Lang, Jerome
PB - International Joint Conferences on Artificial Intelligence
T2 - 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
Y2 - 13 July 2018 through 19 July 2018
ER -