LucIE: Language-guided local image editing for fashion images

Huanglu Wen, Shaodi You, Ying Fu

Research output: Contribution to journal › Article › peer-review

Abstract

Language-guided fashion image editing is challenging: fashion image edits are local and require high precision, while natural language cannot provide precise visual guidance. In this paper, we propose LucIE, a novel unsupervised language-guided local image editing method for fashion images. LucIE adopts and modifies a recent text-to-image synthesis network, DF-GAN, as its backbone. However, the synthesis backbone often changes the global structure of the input image, making local image editing impractical. To increase structural consistency between input and edited images, we propose the Content-Preserving Fusion Module (CPFM). Unlike existing fusion modules, CPFM avoids iterative refinement of visual feature maps and instead accumulates additive modifications on RGB maps. LucIE performs local image editing explicitly via language-guided image segmentation and mask-guided image blending, while training only on image-text pairs. Results on the DeepFashion dataset show that LucIE achieves state-of-the-art performance. Compared with previous methods, images generated by LucIE also exhibit fewer artifacts. We provide visualizations and perform ablation studies to validate LucIE and the CPFM, and we demonstrate and analyze the method's limitations to provide a better understanding of its behavior.
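The abstract describes two mechanisms that a short sketch can illustrate: accumulating additive modifications on RGB maps (the CPFM idea, as opposed to iteratively refining feature maps) and mask-guided blending of the edited image with the input. The PyTorch sketch below is a hedged illustration only; the names AdditiveRGBStage and blend_with_mask, the layer choices, and the tensor shapes are assumptions made for this example, not the paper's actual DF-GAN-based implementation.

import torch
import torch.nn as nn

class AdditiveRGBStage(nn.Module):
    """Hypothetical fusion stage: predicts an additive RGB residual
    conditioned on image features and a text embedding, rather than
    iteratively refining the feature maps themselves."""
    def __init__(self, feat_dim: int, text_dim: int):
        super().__init__()
        self.to_rgb = nn.Sequential(
            nn.Conv2d(feat_dim + text_dim, feat_dim, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(feat_dim, 3, kernel_size=3, padding=1),
        )

    def forward(self, feats: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # Broadcast the sentence embedding over all spatial positions,
        # then map the fused tensor to a 3-channel RGB residual.
        b, _, h, w = feats.shape
        text_map = text_emb[:, :, None, None].expand(b, -1, h, w)
        return self.to_rgb(torch.cat([feats, text_map], dim=1))

def blend_with_mask(original: torch.Tensor, edited: torch.Tensor,
                    mask: torch.Tensor) -> torch.Tensor:
    """Mask-guided blending: keep original pixels outside the
    language-guided segmentation mask, edited pixels inside it."""
    return mask * edited + (1.0 - mask) * original

# Hypothetical usage with placeholder tensors (shapes are assumptions):
original = torch.rand(1, 3, 64, 64)   # input fashion image
feats = torch.rand(1, 32, 64, 64)     # backbone feature maps
text_emb = torch.rand(1, 16)          # sentence embedding
mask = torch.zeros(1, 1, 64, 64)
mask[:, :, 16:48, 16:48] = 1.0        # stand-in for a predicted region

stage = AdditiveRGBStage(feat_dim=32, text_dim=16)
edited = original + stage(feats, text_emb)   # additive RGB modification
result = blend_with_mask(original, edited, mask)

Because each stage adds a residual in RGB space rather than rewriting feature maps, regions the residual leaves near zero stay close to the input image, which is one plausible reading of how the abstract's content preservation works.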

Original language: English
Pages (from-to): 179-194
Number of pages: 16
Journal: Computational Visual Media
Volume: 11
Issue number: 1
Publication status: Published - 2025
Externally published: Yes

Keywords

  • content preservation
  • deep learning
  • fashion images
  • language-guided image editing
  • local image editing
