Unsupervised Style Control for Image Captioning

Junyu Tian, Zhikun Yang, Shumin Shi*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

1 Citation (Scopus)

Abstract

We propose a novel unsupervised image captioning method. Image captioning draws on two fields of deep learning: natural language processing and computer vision. The excessive pursuit of evaluation scores has made model-generated captions stylistically monotonous, which makes it difficult to meet people’s demand for vivid, stylized image captions. We therefore propose an image captioning model that combines text style transfer with image emotion recognition, allowing the model to better understand images and generate controllable stylized captions. The proposed method automatically judges the emotion contained in an image through an image emotion recognition module to better understand the image content, and controls the description through text style transfer, thereby generating captions that meet people’s expectations. To our knowledge, this is the first work to use both image emotion recognition and text style control.

Original language: English
Title of host publication: Data Science - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022, Proceedings
Editors: Yang Wang, Guobin Zhu, Qilong Han, Hongzhi Wang, Xianhua Song, Zeguang Lu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 413-424
Number of pages: 12
ISBN (Print): 9789811951930
DOI
Publication status: Published - 2022
Event: 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022 - Chengdu, China
Duration: 19 Aug 2022 to 22 Aug 2022

Publication series

Name: Communications in Computer and Information Science
Volume: 1628 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022
Country/Territory: China
City: Chengdu
Period: 19/08/22 to 22/08/22
