Joint Embedding based Text-To-Image Synthesis

  • Menglan Wang
  • Yue Yu*
  • Benyuan Li

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Learning a joint embedding of images and text is important for text-to-image synthesis because it bridges the semantic gap between the two modalities. Most existing text-to-image generation methods depend on the quality of the text embedding: if the text features are not extracted well, it is difficult for subsequent stages to generate satisfactory images. However, these methods are affected by the form of textual expression when extracting text features, so the ideal text features are not obtained. In this paper, we propose a new text encoder that learns a joint embedding to capture the semantic information shared by the real images and the input text, and that eliminates the interference of textual expression forms. The main difference from existing work is that, for different texts describing the same image, the proposed text encoder extracts similar semantic features, because the texts contain the same semantic information even though their expressions differ. In addition, a special auxiliary classifier for the discriminator is adopted to retain low-level features and generate finely detailed images. We evaluate this work on the Caltech-UCSD Birds 200 (CUB) and Oxford-102 flower datasets; experiments show that it performs better than state-of-the-art methods.
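The abstract does not state the training objective, so the following is only an illustrative sketch of the general idea of a joint embedding, not the authors' method. It assumes that images and captions have already been mapped to vectors of the same dimension, and it uses a hypothetical loss (`joint_embedding_loss`, mean cosine distance to the image embedding) under which captions that express the same content in different ways score equally well.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def joint_embedding_loss(image_emb, caption_embs):
    """Hypothetical objective (an assumption, not taken from the paper):
    mean cosine distance between one image embedding and the embeddings
    of all captions that describe that image."""
    distances = [1.0 - cosine_similarity(image_emb, c) for c in caption_embs]
    return sum(distances) / len(distances)


# Toy vectors standing in for encoder outputs (made up for illustration).
image = [1.0, 0.0, 1.0]

# Two differently scaled caption vectors pointing the same way as the image:
# a stand-in for "different wording, same semantics".
aligned_captions = [[2.0, 0.0, 2.0], [0.5, 0.0, 0.5]]

# Caption vectors orthogonal to the image embedding: unrelated semantics.
unrelated_captions = [[0.0, 1.0, 0.0], [1.0, 0.0, -1.0]]

loss_aligned = joint_embedding_loss(image, aligned_captions)
loss_unrelated = joint_embedding_loss(image, unrelated_captions)
```

In this toy setting the aligned captions give a loss near zero and the unrelated captions give a loss near one. A text encoder trained to reduce a loss of this kind would be pushed to produce similar features for paraphrases of the same image description.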

Original language: English
Title of host publication: Proceedings - IEEE 32nd International Conference on Tools with Artificial Intelligence, ICTAI 2020
Editors: Miltos Alamaniotis, Shimei Pan
Publisher: IEEE Computer Society
Pages: 432-436
Number of pages: 5
ISBN (Electronic): 9781728192284
DOIs
Publication status: Published - Nov 2020
Event: 32nd IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2020 - Virtual, Baltimore, United States
Duration: 9 Nov 2020 - 11 Nov 2020

Publication series

Name: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
Volume: 2020-November
ISSN (Print): 1082-3409

Conference

Conference: 32nd IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2020
Country/Territory: United States
City: Virtual, Baltimore
Period: 9/11/20 - 11/11/20

Keywords

  • joint embedding
  • special auxiliary classifier
  • text-to-image synthesis
