Adaptive depth-aware visual relationship detection

Ming Gang Gan, Yuxuan He*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

Visual relationship detection aims at detecting the interactions between objects in flat images, where visual appearance and the spatial relationship between objects are two key factors for detection. However, most existing methods extract only 2D object information from flat images, which lacks the depth information present in actual 3D space. To obtain and utilize depth information for visual relationship detection, we construct the Depth VRDs dataset as an extension of the VRD dataset and propose the adaptive depth-aware visual relationship detection network (ADVRD). In terms of visual appearance, we propose a depth-aware visual fusion module that uses additional depth visual information to guide where the RGB visual information needs to be strengthened. In terms of spatial relationship, to generate a more accurate depth representation when locating an object's depth spatial position, we propose an adaptive depth spatial location method that uses regional information variance to measure the information relevance of each small region in the object bounding box. Experimental results show that depth information can significantly improve the performance of our network on visual relationship detection tasks, especially for zero-shot cases.
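As a rough illustration of the adaptive depth spatial location idea described above, the sketch below divides an object's bounding box on a depth map into small regions, scores each region by its depth variance, and averages the regional depths with inverse-variance weights. The grid size, the inverse-variance weighting, and the function name are assumptions for illustration only, not the paper's exact formulation.

```python
import numpy as np

def adaptive_depth_location(depth_map, box, grid=4, eps=1e-6):
    """Estimate an object's depth by weighting small sub-regions of its
    bounding box according to their depth variance.
    Illustrative sketch only; grid size and inverse-variance weighting
    are assumed, not taken from the paper."""
    x1, y1, x2, y2 = box
    patch = depth_map[y1:y2, x1:x2]
    h, w = patch.shape
    ys = np.linspace(0, h, grid + 1, dtype=int)
    xs = np.linspace(0, w, grid + 1, dtype=int)

    means, weights = [], []
    for i in range(grid):
        for j in range(grid):
            cell = patch[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            if cell.size == 0:
                continue
            means.append(cell.mean())
            # Lower depth variance -> more coherent region -> higher relevance.
            weights.append(1.0 / (cell.var() + eps))

    return float(np.average(means, weights=np.asarray(weights)))

# Hypothetical usage: depth_map is an H x W array of per-pixel depths,
# box is (x1, y1, x2, y2) in pixel coordinates.
# depth = adaptive_depth_location(depth_map, (30, 40, 180, 220))
```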

Original language: English
Article number: 108786
Journal: Knowledge-Based Systems
Volume: 247
DOI
Publication status: Published - 8 Jul 2022
