Adaptive Image-to-Video Scene Graph Generation via Knowledge Reasoning and Adversarial Learning

Jin Chen, Xiaofeng Ji, Xinxiao Wu*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

1 Citation (Scopus)

Abstract

Scene graph in a video conveys a wealth of information about objects and their relationships in the scene, thus benefiting many downstream tasks such as video captioning and visual question answering. Existing methods of scene graph generation require large-scale training videos annotated with objects and relationships in each frame to learn a powerful model. However, such comprehensive annotation is time-consuming and labor-intensive. On the other hand, it is much easier and less cost to annotate images with scene graphs, so we investigate leveraging annotated images to facilitate training a scene graph generation model for unannotated videos, namely image-to-video scene graph generation. This task presents two challenges: 1) infer unseen dynamic relationships in videos from static relationships in images due to the absence of motion information in images; 2) adapt objects and static relationships from images to video frames due to the domain shift between them. To address the first challenge, we exploit external commonsense knowledge to infer the unseen dynamic relationship from the temporal evolution of static relationships. We tackle the second challenge by hierarchical adversarial learning to reduce the data distribution discrepancy between images and video frames. Extensive experiment results on two benchmark video datasets demonstrate the effectiveness of our method.

Original languageEnglish
Title of host publicationAAAI-22 Technical Tracks 1
PublisherAssociation for the Advancement of Artificial Intelligence
Pages276-284
Number of pages9
ISBN (Electronic)1577358767, 9781577358763
Publication statusPublished - 30 Jun 2022
Event36th AAAI Conference on Artificial Intelligence, AAAI 2022 - Virtual, Online
Duration: 22 Feb 20221 Mar 2022

Publication series

NameProceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022
Volume36

Conference

Conference36th AAAI Conference on Artificial Intelligence, AAAI 2022
CityVirtual, Online
Period22/02/221/03/22

Fingerprint

Dive into the research topics of 'Adaptive Image-to-Video Scene Graph Generation via Knowledge Reasoning and Adversarial Learning'. Together they form a unique fingerprint.

Cite this