Unsupervised Style Control for Image Captioning

Junyu Tian; Zhikun Yang; Shumin Shi

doi:10.1007/978-981-19-5194-7_31

Junyu Tian, Zhikun Yang, Shumin Shi^*

^*Corresponding author for this work

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-review

1 Citation (Scopus)

Abstract

We propose a novel unsupervised image captioning method. Image captioning involves two fields of deep learning, natural language processing and computer vision. The excessive pursuit of model evaluation results makes the caption style generated by the model too monotonous, which is difficult to meet people’s demands for vivid and stylized image captions. Therefore, we propose an image captioning model that combines text style transfer and image emotion recognition methods, with which the model can better understand images and generate controllable stylized captions. The proposed method can automatically judge the emotion contained in the image through the image emotion recognition module, better understand the image content, and control the description through the text style transfer method, thereby generating captions that meet people’s expectations. To our knowledge, this is the first work to use both image emotion recognition and text style control.

Original language	English
Title of host publication	Data Science - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022, Proceedings
Editors	Yang Wang, Guobin Zhu, Qilong Han, Hongzhi Wang, Xianhua Song, Zeguang Lu
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	413-424
Number of pages	12
ISBN (Print)	9789811951930
DOI	https://doi.org/10.1007/978-981-19-5194-7_31
Publication status	Published - 2022
Event	8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022 - Chengdu, China; Duration: 19 Aug 2022 → 22 Aug 2022

Publication series

Name	Communications in Computer and Information Science
Volume	1628 CCIS
ISSN (Print)	1865-0929
ISSN (Electronic)	1865-0937

Conference

Conference	8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022
Country/Territory	China
City	Chengdu
Period	19/08/22 → 22/08/22

Access to Document

10.1007/978-981-19-5194-7_31

Cite this

Tian, J., Yang, Z., & Shi, S. (2022). Unsupervised Style Control for Image Captioning. In Y. Wang, G. Zhu, Q. Han, H. Wang, X. Song, & Z. Lu (Eds.), Data Science - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022, Proceedings (pp. 413-424). (Communications in Computer and Information Science; Vol. 1628 CCIS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-19-5194-7_31

Tian, Junyu ; Yang, Zhikun ; Shi, Shumin. / Unsupervised Style Control for Image Captioning. Data Science - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022, Proceedings. editor / Yang Wang ; Guobin Zhu ; Qilong Han ; Hongzhi Wang ; Xianhua Song ; Zeguang Lu. Springer Science and Business Media Deutschland GmbH, 2022. pp. 413-424 (Communications in Computer and Information Science).

@inproceedings{607edc247c83480db108df31d3d7edb8,

title = "Unsupervised Style Control for Image Captioning",

abstract = "We propose a novel unsupervised image captioning method. Image captioning involves two fields of deep learning, natural language processing and computer vision. The excessive pursuit of model evaluation results makes the caption style generated by the model too monotonous, which is difficult to meet people{\textquoteright}s demands for vivid and stylized image captions. Therefore, we propose an image captioning model that combines text style transfer and image emotion recognition methods, with which the model can better understand images and generate controllable stylized captions. The proposed method can automatically judge the emotion contained in the image through the image emotion recognition module, better understand the image content, and control the description through the text style transfer method, thereby generating captions that meet people{\textquoteright}s expectations. To our knowledge, this is the first work to use both image emotion recognition and text style control.",

keywords = "Image caption, Image sentiment recognization, Text style transfer",

author = "Junyu Tian and Zhikun Yang and Shumin Shi",

note = "Publisher Copyright: {\textcopyright} 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.; 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022 ; Conference date: 19-08-2022 Through 22-08-2022",

year = "2022",

doi = "10.1007/978-981-19-5194-7_31",

language = "English",

isbn = "9789811951930",

series = "Communications in Computer and Information Science",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "413--424",

editor = "Yang Wang and Guobin Zhu and Qilong Han and Hongzhi Wang and Xianhua Song and Zeguang Lu",

booktitle = "Data Science - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022, Proceedings",

address = "Germany",

}

Tian, J, Yang, Z & Shi, S 2022, Unsupervised Style Control for Image Captioning. in Y Wang, G Zhu, Q Han, H Wang, X Song & Z Lu (eds), Data Science - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022, Proceedings. Communications in Computer and Information Science, vol. 1628 CCIS, Springer Science and Business Media Deutschland GmbH, pp. 413-424, 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022, Chengdu, China, 19/08/22. https://doi.org/10.1007/978-981-19-5194-7_31

Unsupervised Style Control for Image Captioning. / Tian, Junyu; Yang, Zhikun; Shi, Shumin.
Data Science - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022, Proceedings. ed. / Yang Wang; Guobin Zhu; Qilong Han; Hongzhi Wang; Xianhua Song; Zeguang Lu. Springer Science and Business Media Deutschland GmbH, 2022. p. 413-424 (Communications in Computer and Information Science; Vol. 1628 CCIS).

TY - GEN

T1 - Unsupervised Style Control for Image Captioning

AU - Tian, Junyu

AU - Yang, Zhikun

AU - Shi, Shumin

PY - 2022

Y1 - 2022

N2 - We propose a novel unsupervised image captioning method. Image captioning involves two fields of deep learning, natural language processing and computer vision. The excessive pursuit of model evaluation results makes the caption style generated by the model too monotonous, which is difficult to meet people’s demands for vivid and stylized image captions. Therefore, we propose an image captioning model that combines text style transfer and image emotion recognition methods, with which the model can better understand images and generate controllable stylized captions. The proposed method can automatically judge the emotion contained in the image through the image emotion recognition module, better understand the image content, and control the description through the text style transfer method, thereby generating captions that meet people’s expectations. To our knowledge, this is the first work to use both image emotion recognition and text style control.

AB - We propose a novel unsupervised image captioning method. Image captioning involves two fields of deep learning, natural language processing and computer vision. The excessive pursuit of model evaluation results makes the caption style generated by the model too monotonous, which is difficult to meet people’s demands for vivid and stylized image captions. Therefore, we propose an image captioning model that combines text style transfer and image emotion recognition methods, with which the model can better understand images and generate controllable stylized captions. The proposed method can automatically judge the emotion contained in the image through the image emotion recognition module, better understand the image content, and control the description through the text style transfer method, thereby generating captions that meet people’s expectations. To our knowledge, this is the first work to use both image emotion recognition and text style control.

KW - Image caption

KW - Image sentiment recognization

KW - Text style transfer

UR - http://www.scopus.com/inward/record.url?scp=85136789375&partnerID=8YFLogxK

U2 - 10.1007/978-981-19-5194-7_31

DO - 10.1007/978-981-19-5194-7_31

M3 - Conference contribution

AN - SCOPUS:85136789375

SN - 9789811951930

T3 - Communications in Computer and Information Science

SP - 413

EP - 424

BT - Data Science - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022, Proceedings

A2 - Wang, Yang

A2 - Zhu, Guobin

A2 - Han, Qilong

A2 - Wang, Hongzhi

A2 - Song, Xianhua

A2 - Lu, Zeguang

PB - Springer Science and Business Media Deutschland GmbH

T2 - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022

Y2 - 19 August 2022 through 22 August 2022

ER -

Tian J, Yang Z, Shi S. Unsupervised Style Control for Image Captioning. In Wang Y, Zhu G, Han Q, Wang H, Song X, Lu Z, editors, Data Science - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022, Proceedings. Springer Science and Business Media Deutschland GmbH. 2022. p. 413-424. (Communications in Computer and Information Science). doi: 10.1007/978-981-19-5194-7_31
