Diffusion-Geo: A Two-Stage Controllable Text-To-Image Generative Model for Remote Sensing Scenarios

Miaoxin Cai, Wei Zhang, Tong Zhang, Yin Zhuang*, He Chen, Liang Chen, Can Li

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Image generation is a crucial task for facilitating intelligent interpretation in the remote sensing domain. Expanding dataset size through image generation can enhance model performance on downstream tasks. However, current generative models in remote sensing are mostly unconditional or guided by simple text, so the generated images lack spatial and semantic constraints. This lack of control can negatively affect downstream task models. To tackle these challenges, a two-stage controllable text-to-image generative model called Diffusion-Geo is presented. In the first stage, an extensive image-text generation dataset called RS-Control is created through prompt engineering of multimodal large language models (MLLMs) and manual prompts for existing datasets, incorporating diverse conditional controls with rich spatial and semantic information. The RS-Control dataset is then used to train a universal controllable image generative model. The second stage involves efficiently tuning the universal model for different task datasets, minimizing fine-tuning costs while preserving diversity and high-quality features. Experiments conducted on the RSICD caption dataset and the WHU change detection dataset demonstrate the superiority of Diffusion-Geo over other state-of-the-art models in image generation.

Original language: English
Title of host publication: IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7003-7006
Number of pages: 4
ISBN (Electronic): 9798350360325
DOI
Publication status: Published - 2024
Event: 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024 - Athens, Greece
Duration: 7 Jul 2024 to 12 Jul 2024

Publication series

Name: International Geoscience and Remote Sensing Symposium (IGARSS)

Conference

Conference: 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024
Country/Territory: Greece
City: Athens
Period: 7/07/24 to 12/07/24
