Joint Embedding based Text-To-Image Synthesis

Menglan Wang, Yue Yu*, Benyuan Li

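*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract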

Learning a joint embedding between image and text is important for text-to-image synthesis because it bridges the semantic gap between the two modalities. Most existing text-to-image generation methods depend on the quality of the text embedding: if the text features are not extracted well, it is difficult for subsequent stages to generate satisfactory images. However, these methods are disturbed by the form of textual expression when extracting text features, so the ideal text features cannot be obtained. In this paper, we propose a new text encoder that learns a joint embedding to capture the semantic information shared by the real images and the input text, and eliminates the interference of textual expression forms. The main difference from existing work is that, for different texts describing the same image, the proposed text encoder extracts similar semantic features, because the texts contain the same semantic information even though their expressions differ. Meanwhile, a special auxiliary classifier for the discriminator is adopted to retain low-level features and generate fine-detailed images. We evaluate this work on the Caltech-UCSD Birds 200 (CUB) and Oxford-102 flower datasets; experiments show that it performs better than state-of-the-art methods.
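The idea that different captions of the same image should map to similar features can be illustrated with a small numerical sketch. The code below is not the loss used in the paper, which this record does not specify; it is a hypothetical cosine-based objective, and the names `l2_normalize` and `joint_embedding_loss` are assumptions introduced only for illustration. It penalizes captions that drift away from the image embedding or from each other.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each vector (last axis) to unit Euclidean length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def joint_embedding_loss(caption_emb, image_emb):
    """Hypothetical loss, assumed for illustration (not the paper's objective).

    caption_emb: (K, D) embeddings of K captions describing the same image.
    image_emb:   (D,) embedding of that image.
    Returns 0 when every caption points in the same direction as the image.
    """
    c = l2_normalize(np.asarray(caption_emb, dtype=float))
    v = l2_normalize(np.asarray(image_emb, dtype=float))

    # Text-image term: pull each caption toward the image embedding.
    text_image = np.mean(1.0 - c @ v)

    # Text-text term: pull captions of the same image toward each other.
    k = c.shape[0]
    if k > 1:
        sim = c @ c.T
        off_diag = sim[~np.eye(k, dtype=bool)]
        text_text = np.mean(1.0 - off_diag)
    else:
        text_text = 0.0

    return float(text_image + text_text)


if __name__ == "__main__":
    # Two captions with the same direction as the image: loss near zero.
    print(joint_embedding_loss([[1.0, 0.0], [2.0, 0.0]], [3.0, 0.0]))
    # One caption orthogonal to the image and to the other caption: larger loss.
    print(joint_embedding_loss([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0]))
```

In a trained system the embeddings would come from learned text and image encoders; here they are plain vectors, so the sketch only shows how such an objective rewards agreement across differently worded captions.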

Original language: English
Title of host publication: Proceedings - IEEE 32nd International Conference on Tools with Artificial Intelligence, ICTAI 2020
Editors: Miltos Alamaniotis, Shimei Pan
Publisher: IEEE Computer Society
Pages: 432-436
Number of pages: 5
ISBN (electronic): 9781728192284
DOI
Publication status: Published - Nov 2020
Event: 32nd IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2020 - Virtual, Baltimore, United States
Duration: 9 Nov 2020 to 11 Nov 2020

Publication series

Name: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
Volume: 2020-November
ISSN (print): 1082-3409

Conference

Conference: 32nd IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2020
Country/Territory: United States
City: Virtual, Baltimore
Period: 9/11/20 to 11/11/20
