Image Editing based on Diffusion Model for Remote Sensing Image Change Captioning

Miaoxin Cai, He Chen*, Can Li, Shuyu Gan, Liang Chen, Yin Zhuang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Remote Sensing Image Change Captioning (RSICC) uses natural language to describe changes between remote sensing images of the same area captured at different times. However, the long intervals between multi-temporal acquisitions, the infrequency of observable changes, and the limits on observation locations make it difficult to acquire and annotate a large, diverse dataset for analyzing change in multi-temporal images. This scarcity of labeled data hinders the training of RSICC models and leads to poor generalization. Compared with annotated, registered bi-temporal images, single-temporal data is easier to obtain. Therefore, to tackle the poor generalization of RSICC models when annotated samples are limited, a text-guided image pairs generation (TGIPG) method is proposed that creates synthetic RSICC datasets from single-temporal data and randomly sampled text instructions via a diffusion-based controllable image editing model. This approach generates additional valid pairs of multi-temporal samples, which addresses the constraint of limited change information. Specifically, the method uses language instructions to introduce change information into the diffusion process, gradually transforming the pre-phase image into the post-phase image. Experiments on the LEVIR-CC dataset show that, when the number of training samples is restricted, synthetic data produced by this plug-and-play TGIPG method can significantly enhance the performance of any RSICC model.
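
The data-generation loop outlined in the abstract (take a single-temporal image, sample a text instruction, edit the image to obtain a post-phase counterpart) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `ChangePair`, `placeholder_edit`, and `generate_pairs` are hypothetical names, the editor is a trivial stand-in for a diffusion-based editing model, and reusing the sampled instruction as the change caption is an assumption made only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = List[List[int]]  # toy grayscale image; a real pipeline would use tensors


@dataclass
class ChangePair:
    """One synthetic sample: pre-phase image, post-phase image, change caption."""
    pre_image: Image
    post_image: Image
    caption: str


def placeholder_edit(image: Image, instruction: str, rng: random.Random) -> Image:
    """Hypothetical stand-in for a diffusion-based, text-guided editor.

    It only sets a random 2x2 patch to 255 so the output differs from the
    input; a real editor would condition its denoising on `instruction`.
    """
    edited = [row[:] for row in image]  # copy so the pre-phase image stays intact
    top = rng.randrange(len(image) - 1)
    left = rng.randrange(len(image[0]) - 1)
    for r in range(top, top + 2):
        for c in range(left, left + 2):
            edited[r][c] = 255
    return edited


def generate_pairs(
    images: Sequence[Image],
    instructions: Sequence[str],
    edit_fn: Callable[[Image, str, random.Random], Image] = placeholder_edit,
    seed: int = 0,
) -> List[ChangePair]:
    """Build one synthetic bi-temporal pair per single-temporal image."""
    rng = random.Random(seed)
    pairs = []
    for pre in images:
        instruction = rng.choice(instructions)  # randomly sampled text instruction
        post = edit_fn(pre, instruction, rng)   # pre-phase -> post-phase image
        # Assumption: the instruction doubles as the change caption.
        pairs.append(ChangePair(pre, post, instruction))
    return pairs
```

Passing a different `edit_fn` is the point where an actual instruction-guided diffusion editor would be plugged in; the resulting pairs could then be mixed with the limited real training samples.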

Original language: English
Title of host publication: IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331515669
Publication status: Published - 2024
Event: 2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024 - Zhuhai, China
Duration: 22 Nov 2024 → 24 Nov 2024

Publication series

Name: IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024

Conference

Conference: 2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024
Country/Territory: China
City: Zhuhai
Period: 22/11/24 → 24/11/24

Keywords

  • Change captioning
  • diffusion
  • image generation
  • remote sensing
