Multi-View Visual Relationship Detection with Estimated Depth Map

Xiaozhou Liu, Ming Gang Gan*, Yuxuan He

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

The abundant visual information contained in multi-view images is widely used in computer vision tasks. Existing visual relationship detection frameworks have extended the feature vector to improve model performance. However, single-view information cannot fully reveal the visual relationships in complex visual scenes. To solve this problem and exploit multi-view information in a visual relationship detection (VRD) model, a novel multi-view VRD framework based on a monocular RGB image and an estimated depth map is proposed. The contributions of this paper are threefold. First, we construct a novel multi-view framework that fuses information from different views extracted from estimated RGB-D images. Second, a multi-view image generation method is proposed to transfer flat visual space to 3D multi-view space. Third, we redesign the visual relationship balanced classifier so that it can process multi-view feature vectors simultaneously. Detailed experiments were conducted on two datasets to demonstrate the effectiveness of the multi-view VRD framework. The experimental results show that the framework achieves state-of-the-art zero-shot learning performance under specific depth conditions.
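The abstract does not detail the multi-view image generation step, but the usual basis for transferring a flat RGB image into 3D multi-view space from an estimated depth map is pinhole back-projection followed by reprojection into a virtual camera. A minimal sketch of that geometry is below; all function names and camera parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift each pixel (u, v) with estimated depth d to a 3D point
    using the pinhole camera model (hypothetical intrinsics)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def reproject(points, R, t, fx, fy, cx, cy):
    """Project 3D points into a new virtual view given rotation R
    and translation t, yielding novel-view pixel coordinates."""
    cam = points @ R.T + t          # rigid transform into the new camera frame
    z = cam[:, 2]
    u = fx * cam[:, 0] / z + cx
    v = fy * cam[:, 1] / z + cy
    return np.stack([u, v], axis=-1), z

# Toy check: with an identity pose, reprojection recovers the pixel grid.
depth = np.ones((4, 4))
pts = backproject(depth, fx=1.0, fy=1.0, cx=1.5, cy=1.5)
uv, z = reproject(pts, np.eye(3), np.zeros(3), 1.0, 1.0, 1.5, 1.5)
```

In a full pipeline, warping the RGB values along these reprojected coordinates would produce the additional views whose features are then fused by the multi-view framework.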

Original language: English
Article number: 4674
Journal: Applied Sciences (Switzerland)
Volume: 12
Issue number: 9
DOIs
Publication status: Published - 1 May 2022

Keywords

  • RGB-D image
  • computer vision
  • depth map
  • multi view
  • visual relationship detection

