Adaptive depth-aware visual relationship detection

Ming Gang Gan, Yuxuan He*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

Visual relationship detection aims to detect interactions between objects in a flat image, where visual appearance and the spatial relationship between objects are two key factors. However, most existing methods extract only 2D information about objects from flat images, which lacks the depth information present in actual 3D space. To obtain and exploit depth information for visual relationship detection, we construct the Depth VRDs dataset as an extension of the VRD dataset and propose the adaptive depth-aware visual relationship detection network (ADVRD). For visual appearance, we propose a depth-aware visual fusion module that uses additional depth visual information to guide the RGB visual information toward the regions that need to be strengthened. For spatial relationships, to generate a more accurate depth representation when locating an object's depth spatial position, we propose an adaptive depth spatial location method that uses regional information variance to measure the information relevance of each small region in the object bounding box. Experimental results show that depth information significantly improves the performance of our network on visual relationship detection tasks, especially for zero-shot cases.
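The adaptive depth spatial location idea described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the following is a hypothetical interpretation: the object's bounding box is split into a grid of small regions, the depth variance of each region measures how reliable that region is, and a variance-derived weight combines the regional mean depths into one depth value. The function name `adaptive_depth_location`, the grid size, and the inverse-variance weighting are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def adaptive_depth_location(depth_map, box, grid=3):
    """Hypothetical sketch of adaptive depth spatial location.

    Splits the object's bounding box into grid x grid small regions,
    measures each region's depth variance, and combines the regional
    mean depths with variance-based weights (assumption: low variance
    means the region likely lies on a single surface, so it is weighted
    more heavily).
    """
    x1, y1, x2, y2 = box
    patch = depth_map[y1:y2, x1:x2]
    h, w = patch.shape

    depths, weights = [], []
    for i in range(grid):
        for j in range(grid):
            # one small region of the bounding box
            region = patch[i * h // grid:(i + 1) * h // grid,
                           j * w // grid:(j + 1) * w // grid]
            if region.size == 0:
                continue
            # inverse-variance weight: an assumed proxy for
            # "regional information relevance" in the abstract
            weights.append(1.0 / (1.0 + region.var()))
            depths.append(region.mean())

    weights = np.asarray(weights)
    weights /= weights.sum()
    # weighted average over regions gives the object's depth estimate
    return float(np.dot(weights, np.asarray(depths)))
```

On a uniform-depth box every region gets equal weight and the estimate reduces to the plain mean; when part of the box covers background at a different depth, those high-variance regions contribute less.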

Original language: English
Article number: 108786
Journal: Knowledge-Based Systems
Volume: 247
DOIs
Publication status: Published - 8 Jul 2022

Keywords

  • Adaptive depth spatial location
  • Depth-aware visual fusion
  • Estimated depth maps
  • Visual relationship detection
