TY - GEN
T1 - Unsupervised Style Control for Image Captioning
AU - Tian, Junyu
AU - Yang, Zhikun
AU - Shi, Shumin
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
PY - 2022
Y1 - 2022
N2 - We propose a novel unsupervised image captioning method. Image captioning spans two fields of deep learning: natural language processing and computer vision. The excessive pursuit of evaluation scores makes the captions generated by models stylistically monotonous, which makes it difficult to meet people’s demands for vivid, stylized image captions. Therefore, we propose an image captioning model that combines text style transfer and image emotion recognition, enabling the model to better understand images and generate controllable stylized captions. The proposed method automatically judges the emotion contained in an image through the image emotion recognition module, which helps it better understand the image content, and controls the description through text style transfer, thereby generating captions that meet people’s expectations. To our knowledge, this is the first work to use both image emotion recognition and text style control.
AB - We propose a novel unsupervised image captioning method. Image captioning spans two fields of deep learning: natural language processing and computer vision. The excessive pursuit of evaluation scores makes the captions generated by models stylistically monotonous, which makes it difficult to meet people’s demands for vivid, stylized image captions. Therefore, we propose an image captioning model that combines text style transfer and image emotion recognition, enabling the model to better understand images and generate controllable stylized captions. The proposed method automatically judges the emotion contained in an image through the image emotion recognition module, which helps it better understand the image content, and controls the description through text style transfer, thereby generating captions that meet people’s expectations. To our knowledge, this is the first work to use both image emotion recognition and text style control.
KW - Image caption
KW - Image sentiment recognition
KW - Text style transfer
UR - http://www.scopus.com/inward/record.url?scp=85136789375&partnerID=8YFLogxK
U2 - 10.1007/978-981-19-5194-7_31
DO - 10.1007/978-981-19-5194-7_31
M3 - Conference contribution
AN - SCOPUS:85136789375
SN - 9789811951930
T3 - Communications in Computer and Information Science
SP - 413
EP - 424
BT - Data Science - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022, Proceedings
A2 - Wang, Yang
A2 - Zhu, Guobin
A2 - Han, Qilong
A2 - Wang, Hongzhi
A2 - Song, Xianhua
A2 - Lu, Zeguang
PB - Springer Science and Business Media Deutschland GmbH
T2 - 8th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2022
Y2 - 19 August 2022 through 22 August 2022
ER -