A deep reinforced training method for location-based image captioning

Lei Zhao, Chunxia Zhang, Xi Zhang, Yating Hu*, Zhendong Niu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Neural encoder-decoder frameworks have been used extensively in image captioning. Recent research has shown that reinforcement learning can be used to train these frameworks directly on non-differentiable evaluation metrics. However, captions generated this way usually have limited grammaticality and readability. In this paper, we propose a novel model with a location-based mechanism that incorporates the location information of each region in the image, together with a training method that combines the cross entropy loss with reinforcement learning. We evaluate our model on four public benchmarks: Flickr8k, Flickr30k, MSCOCO and Image Chinese Captioning (ICC). Experimental results show that our model improves the readability of the generated captions and outperforms state-of-the-art methods across different evaluation metrics.
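
The abstract does not say how the cross entropy loss and the reinforcement learning objective are combined. The following is only a minimal sketch of one common way to mix the two, assuming a REINFORCE-style surrogate loss with a baseline and a convex weighting. The function names, the `gamma` weight, and the baseline are hypothetical illustrations and are not taken from the paper.

```python
import math


def cross_entropy_loss(token_log_probs):
    """Negative log-likelihood of the ground-truth caption, summed over tokens."""
    return -sum(token_log_probs)


def reinforce_loss(sampled_log_probs, reward, baseline):
    """REINFORCE-style surrogate loss for one sampled caption.

    reward is a sentence-level metric score (e.g. a CIDEr-like value);
    baseline is an assumed variance-reduction term, not specified by the abstract.
    """
    return -(reward - baseline) * sum(sampled_log_probs)


def combined_loss(xe, rl, gamma=0.5):
    """Convex mix of the two losses; gamma is a hypothetical weight in [0, 1]."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return gamma * rl + (1.0 - gamma) * xe


# Toy numbers: log-probabilities the model assigns to each token.
xe = cross_entropy_loss([math.log(0.5), math.log(0.25)])
rl = reinforce_loss([math.log(0.5)], reward=1.0, baseline=0.5)
total = combined_loss(xe, rl, gamma=0.5)
```

With `gamma=0` this reduces to pure cross entropy training and with `gamma=1` to pure reinforcement learning. Intermediate values trade off metric optimization against the likelihood term, which is the kind of balance the abstract credits for improved readability.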

Original language: English
Title of host publication: PRICAI 2018
Subtitle of host publication: Trends in Artificial Intelligence - 15th Pacific Rim International Conference on Artificial Intelligence, Proceedings
Editors: Byeong-Ho Kang, Xin Geng
Publisher: Springer Verlag
Pages: 878-890
Number of pages: 13
ISBN (Print): 9783319973036
Publication status: Published - 2018
Event: 15th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2018 - Nanjing, China
Duration: 28 Aug 2018 to 31 Aug 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11012 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2018
Country/Territory: China
City: Nanjing
Period: 28/08/18 to 31/08/18
