TY - CONF
T1 - ADVANCING CONTROLLABLE DIFFUSION MODEL FOR FEW-SHOT OBJECT DETECTION IN OPTICAL REMOTE SENSING IMAGERY
AU - Zhang, Tong
AU - Zhuang, Yin
AU - Zhang, Xinyi
AU - Wang, Guanqun
AU - Chen, He
AU - Bi, Fukun
N1 - Publisher Copyright:
©2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Few-shot object detection (FSOD) in optical remote sensing imagery must detect rare objects given only a few annotated bounding boxes. Such limited training data can hardly represent the data distribution of realistic remote sensing scenes, restricting FSOD performance. Recently, learning conditional controls for text-to-image diffusion models has made great progress, enabling the precise generation of controllable yet imaginative images from text prompts and spatially localized input conditions. Accordingly, in this work, we explore the potential of diffusion models and propose a solution for few-shot object detection based on controllable data generation. First, drawing upon a few annotated objects, their bounding boxes and categories are used as spatial conditions and text prompts, respectively, and fed into a large text-to-image diffusion model for controlled image generation. Second, based on the generated images, a data transformation is devised to adapt to the scale and orientation variances of remote sensing objects and boost the robustness of model training. Finally, experiments were conducted on the public remote sensing dataset DIOR, and the results demonstrate the effectiveness of the proposed method.
AB - Few-shot object detection (FSOD) in optical remote sensing imagery must detect rare objects given only a few annotated bounding boxes. Such limited training data can hardly represent the data distribution of realistic remote sensing scenes, restricting FSOD performance. Recently, learning conditional controls for text-to-image diffusion models has made great progress, enabling the precise generation of controllable yet imaginative images from text prompts and spatially localized input conditions. Accordingly, in this work, we explore the potential of diffusion models and propose a solution for few-shot object detection based on controllable data generation. First, drawing upon a few annotated objects, their bounding boxes and categories are used as spatial conditions and text prompts, respectively, and fed into a large text-to-image diffusion model for controlled image generation. Second, based on the generated images, a data transformation is devised to adapt to the scale and orientation variances of remote sensing objects and boost the robustness of model training. Finally, experiments were conducted on the public remote sensing dataset DIOR, and the results demonstrate the effectiveness of the proposed method.
KW - Diffusion model
KW - few-shot object detection
KW - optical remote sensing imagery
KW - stable diffusion
UR - http://www.scopus.com/inward/record.url?scp=85208465427&partnerID=8YFLogxK
U2 - 10.1109/IGARSS53475.2024.10642625
DO - 10.1109/IGARSS53475.2024.10642625
M3 - Paper
AN - SCOPUS:85208465427
SP - 7600
EP - 7603
T2 - 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024
Y2 - 7 July 2024 through 12 July 2024
ER -