Visual Relationship Detection: A Survey

Jun Cheng; Lei Wang; Jiaji Wu; Xiping Hu; Gwanggil Jeon; Dacheng Tao; Mengchu Zhou

doi:10.1109/TCYB.2022.3142013

Jun Cheng, Lei Wang^*, Jiaji Wu, Xiping Hu, Gwanggil Jeon, Dacheng Tao, Mengchu Zhou

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

15 Citations (Scopus)

Abstract

Visual relationship detection (VRD) is a recently developed computer vision task that aims to recognize relations or interactions between objects in an image. It is a further learning task after object recognition, and is important for fully understanding images and even the visual world. It has numerous applications, such as image retrieval, machine vision in robotics, visual question answering (VQA), and visual reasoning. However, this problem is difficult because relationships are not definite, and the number of possible relations is much larger than the number of objects. Complete annotation of visual relationships is therefore much more difficult, making this task hard to learn. Many approaches have been proposed to tackle this problem, especially with the development of deep neural networks in recent years. In this survey, we first introduce the background of visual relations. Then, we present a categorization and frameworks of deep learning models for visual relationship detection. The high-level applications, benchmark datasets, and empirical analysis are also introduced for a comprehensive understanding of this task.

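Original language	English
Pages (from-to)	8453-8466
Number of pages	14
Journal	IEEE Transactions on Cybernetics
Volume	52
Issue number	8
DOI	https://doi.org/10.1109/TCYB.2022.3142013
Publication status	Published - 1 Aug 2022
Externally published	Yes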

Cite this

@article{26b7c9ba9f8b43e6a5282ed3b7f7b0a1,

title = "Visual Relationship Detection: A Survey",

abstract = "Visual relationship detection (VRD) is one newly developed computer vision task, aiming to recognize relations or interactions between objects in an image. It is a further learning task after object recognition, and is important for fully understanding images even the visual world. It has numerous applications, such as image retrieval, machine vision in robotics, visual question answer (VQA), and visual reasoning. However, this problem is difficult since relationships are not definite, and the number of possible relations is much larger than objects. So the complete annotation for visual relationships is much more difficult, making this task hard to learn. Many approaches have been proposed to tackle this problem especially with the development of deep neural networks in recent years. In this survey, we first introduce the background of visual relations. Then, we present categorization and frameworks of deep learning models for visual relationship detection. The high-level applications, benchmark datasets, as well as empirical analysis are also introduced for comprehensive understanding of this task.",

keywords = "Deep learning, detection, neural networks, visual relation",

author = "Jun Cheng and Lei Wang and Jiaji Wu and Xiping Hu and Gwanggil Jeon and Dacheng Tao and Mengchu Zhou",

note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",

year = "2022",

month = aug,

day = "1",

doi = "10.1109/TCYB.2022.3142013",

language = "English",

volume = "52",

pages = "8453--8466",

journal = "IEEE Transactions on Cybernetics",

issn = "2168-2267",

publisher = "IEEE Advancing Technology for Humanity",

number = "8",

}

TY - JOUR

T1 - Visual Relationship Detection

T2 - A Survey

AU - Cheng, Jun

AU - Wang, Lei

AU - Wu, Jiaji

AU - Hu, Xiping

AU - Jeon, Gwanggil

AU - Tao, Dacheng

AU - Zhou, Mengchu

PY - 2022/8/1

Y1 - 2022/8/1

N2 - Visual relationship detection (VRD) is one newly developed computer vision task, aiming to recognize relations or interactions between objects in an image. It is a further learning task after object recognition, and is important for fully understanding images even the visual world. It has numerous applications, such as image retrieval, machine vision in robotics, visual question answer (VQA), and visual reasoning. However, this problem is difficult since relationships are not definite, and the number of possible relations is much larger than objects. So the complete annotation for visual relationships is much more difficult, making this task hard to learn. Many approaches have been proposed to tackle this problem especially with the development of deep neural networks in recent years. In this survey, we first introduce the background of visual relations. Then, we present categorization and frameworks of deep learning models for visual relationship detection. The high-level applications, benchmark datasets, as well as empirical analysis are also introduced for comprehensive understanding of this task.

AB - Visual relationship detection (VRD) is one newly developed computer vision task, aiming to recognize relations or interactions between objects in an image. It is a further learning task after object recognition, and is important for fully understanding images even the visual world. It has numerous applications, such as image retrieval, machine vision in robotics, visual question answer (VQA), and visual reasoning. However, this problem is difficult since relationships are not definite, and the number of possible relations is much larger than objects. So the complete annotation for visual relationships is much more difficult, making this task hard to learn. Many approaches have been proposed to tackle this problem especially with the development of deep neural networks in recent years. In this survey, we first introduce the background of visual relations. Then, we present categorization and frameworks of deep learning models for visual relationship detection. The high-level applications, benchmark datasets, as well as empirical analysis are also introduced for comprehensive understanding of this task.

KW - Deep learning

KW - detection

KW - neural networks

KW - visual relation

UR - http://www.scopus.com/inward/record.url?scp=85123781319&partnerID=8YFLogxK

U2 - 10.1109/TCYB.2022.3142013

DO - 10.1109/TCYB.2022.3142013

M3 - Article

C2 - 35077387

AN - SCOPUS:85123781319

SN - 2168-2267

VL - 52

SP - 8453

EP - 8466

JO - IEEE Transactions on Cybernetics

JF - IEEE Transactions on Cybernetics

IS - 8

ER -
