Image Dense Captioning of Irregular Regions Based on Visual Saliency

Xiaosheng Wen*, Ping Jian*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Traditional dense captioning aims to describe the local details of an image in natural language. It usually runs object detection first and then describes the content of each detected bounding box, which yields rich descriptions. However, captioning based on object detection often pays little attention to the relationships between objects and their environment, or among the objects themselves. Moreover, to date no dense captioning method can handle irregular regions. To address these problems, we propose a region division method based on visual saliency that focuses on areas rather than only on objects. Based on this division, we generate local descriptions of irregular regions. For each area, we combine the image with the target area to generate features, which are fed into the caption model. We use the Visual Genome dataset for training and testing. In experiments, our model is comparable to the baseline in the traditional bounding-box setting, and the descriptions it generates for irregular regions are equally good. Our model also performs well in image retrieval experiments and produces less redundant information. In application, we support manual selection of a region of interest in the image for description, which can assist in expanding the dataset.
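
The abstract does not say how the saliency-based division or the image-region combination is implemented. Purely as an illustration, the following minimal sketch assumes that a region is obtained by thresholding a normalized saliency map and that the image and region are combined by appending the binary mask as a fourth channel. The function names `region_mask_from_saliency` and `combine_image_and_region` and the threshold value are hypothetical and are not taken from the paper.

```python
import numpy as np

def region_mask_from_saliency(saliency, threshold=0.5):
    """Assumed division step: mark pixels whose min-max normalized
    saliency exceeds the threshold. The mask may be any shape,
    not just a rectangle."""
    s = saliency.astype(np.float64)
    span = s.max() - s.min()
    if span == 0:
        # A flat saliency map has no salient region.
        return np.zeros(s.shape, dtype=bool)
    s = (s - s.min()) / span
    return s > threshold

def combine_image_and_region(image, mask):
    """Assumed combination step: append the mask as an extra channel,
    turning an (H, W, 3) image into an (H, W, 4) model input."""
    mask_channel = mask[..., None].astype(np.float32)
    return np.concatenate([image.astype(np.float32), mask_channel], axis=-1)

# Toy 3x3 example with a non-rectangular salient area.
saliency = np.array([[0.0, 0.2, 0.9],
                     [0.1, 0.8, 1.0],
                     [0.0, 0.1, 0.3]])
image = np.ones((3, 3, 3), dtype=np.float32)

mask = region_mask_from_saliency(saliency)
features_in = combine_image_and_region(image, mask)
print(mask.sum(), features_in.shape)
```

In this sketch the resulting array would be passed to a feature extractor ahead of the caption model. A manually selected region of interest, as mentioned in the abstract, could be supplied as `mask` directly in place of the thresholded saliency map.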

Original language: English
Title of host publication: Proceedings - 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms, PRMVIA 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 8-14
Number of pages: 7
ISBN (Electronic): 9798350346596
DOI
Publication status: Published - 2023
Event: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms, PRMVIA 2023 - Beihai, China
Duration: 24 Mar 2023 to 26 Mar 2023

Publication series

Name: Proceedings - 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms, PRMVIA 2023

Conference

Conference: 2023 International Conference on Pattern Recognition, Machine Vision and Intelligent Algorithms, PRMVIA 2023
Country/Territory: China
City: Beihai
Period: 24/03/23 to 26/03/23
