Text-to-Image Synthesis with Threshold-Equipped Matching-Aware GAN

Jun Shang, Wenxin Yu*, Lu Che, Zhiqiang Zhang, Hongjie Cai, Zhiyu Deng, Jun Gong, Peng Chen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we propose a novel Equipped with Threshold Matching-Aware Generative Adversarial Network (ETMA-GAN) for text-to-image synthesis. By filtering out inaccurate negative samples, the discriminator can more accurately determine whether the generator has produced images that match their descriptions. In addition, to strengthen the discriminator's ability to capture key semantic information, a word-level fine-grained supervisor is constructed, which in turn drives the generator to synthesize high-quality image details. Extensive experiments and ablation studies on the Caltech-UCSD Birds 200 (CUB) and Microsoft Common Objects in Context (MS COCO) datasets demonstrate the effectiveness of the proposed method and its superiority over existing methods. In both subjective and objective evaluations, the proposed model shows advantages over recent state-of-the-art methods, in particular producing synthetic images that are more realistic and conform better to their text descriptions.
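
The abstract does not say how inaccurate negative samples are identified. The sketch below is only one possible reading, stated as an assumption: in matching-aware training, mismatched (image, caption) pairs serve as negatives, and a caption drawn for another image may still describe the current image well. The function name `filter_negatives`, the cosine-similarity criterion, and the threshold value 0.8 are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_negatives(text_embs, threshold=0.8):
    """Return the (image_index, caption_index) mismatched pairs to keep.

    Hypothetical rule (an assumption, not the paper's definition):
    caption j is a usable negative for image i only if its embedding is
    sufficiently different from the embedding of image i's own caption.
    """
    kept = []
    for i, own in enumerate(text_embs):
        for j, other in enumerate(text_embs):
            if i != j and cosine(own, other) < threshold:
                kept.append((i, j))
    return kept


# Toy caption embeddings: captions 0 and 1 are near-duplicates,
# so neither should be used as a negative for the other.
embs = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
pairs = filter_negatives(embs, threshold=0.8)
```

In this toy batch the pairs `(0, 1)` and `(1, 0)` are dropped because the two captions are nearly identical, while every pairing that involves caption 2 is kept as a negative.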

Original language: English
Title of host publication: Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings
Editors: Biao Luo, Long Cheng, Zheng-Guang Wu, Hongyi Li, Chaojie Li
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 161-172
Number of pages: 12
ISBN (Print): 9789819981472
DOIs
Publication status: Published - 2024
Externally published: Yes
Event: 30th International Conference on Neural Information Processing, ICONIP 2023 - Changsha, China
Duration: 20 Nov 2023 to 23 Nov 2023

Publication series

Name: Communications in Computer and Information Science
Volume: 1966 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 30th International Conference on Neural Information Processing, ICONIP 2023
Country/Territory: China
City: Changsha
Period: 20/11/23 to 23/11/23

Keywords

  • Computer Vision
  • Generative Adversarial Networks
  • Matching-Aware
  • Text-to-Image Synthesis
