Exploiting Knowledge Embedding to Improve the Description for Image Captioning

Dandan Song*, Cuimei Peng, Huan Yang, Lejian Liao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Most existing image captioning methods are based on the encoder-decoder framework, which translates visual features directly into sentences without exploiting the commonsense knowledge available in knowledge graphs. Inspired by the success of information retrieval and question answering systems that leverage prior knowledge, we explore a knowledge embedding approach for image captioning. In this paper, we propose a Knowledge Embedding with Attention on Attention (KE-AoA) method, which judges whether and how strongly objects are related and augments the semantic correlations and constraints between them. KE-AoA combines a knowledge base method (TransE) with a text method (Skip-gram), adding external knowledge graph information (triplets) to the language model as a regularization term that guides the learning of word vectors. It then employs the AoA module to model the relations among different objects. As more inherent relations and commonsense knowledge are learned, the model can generate better image descriptions. Experiments on the MSCOCO dataset show a significant improvement over existing methods and validate the effectiveness of our prior-knowledge-based approach.
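
The abstract does not spell out the training objective, so the following is only a minimal sketch of the general idea it describes: a Skip-gram text loss with a TransE triplet loss added as a regularization term. It uses the standard textbook forms of both losses, which may differ from the authors' exact formulation. The function names, the weight `lam`, and the toy two-dimensional vectors are all assumptions made for illustration.

```python
import math

def transe_distance(h, r, t):
    """L2 distance ||h + r - t||; small when the triplet (h, r, t) is plausible."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def transe_margin_loss(pos, neg, margin=1.0):
    """Margin ranking loss: a true triplet should score at least `margin`
    closer than a corrupted one (standard TransE form)."""
    return max(0.0, margin + transe_distance(*pos) - transe_distance(*neg))

def skipgram_neg_loss(center, context, negatives):
    """Skip-gram negative-sampling loss for one (center, context) pair."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    log_sigmoid = lambda z: -math.log1p(math.exp(-z))
    loss = -log_sigmoid(dot(center, context))
    loss -= sum(log_sigmoid(-dot(center, n)) for n in negatives)
    return loss

def joint_loss(text_loss, kg_loss, lam=0.5):
    """Text loss plus the knowledge graph loss as a regularizer.
    `lam` is a hypothetical weighting hyperparameter."""
    return text_loss + lam * kg_loss

# Toy embeddings (assumed values, not learned from any data).
person, ride, horse, banana = [1, 0], [0, 1], [1, 1], [-1, -1]

true_triplet = (person, ride, horse)      # (head, relation, tail)
corrupted_triplet = (person, ride, banana)  # tail replaced at random

kg = transe_margin_loss(true_triplet, corrupted_triplet)
text = skipgram_neg_loss(person, horse, negatives=[banana])
total = joint_loss(text, kg)
```

In this sketch the shared word vectors receive gradients from both terms, so words linked by a triplet are pulled toward satisfying `h + r ≈ t` while still fitting their textual contexts.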

Original language: English
Title of host publication: Knowledge Graph and Semantic Computing
Subtitle of host publication: Knowledge Graph and Cognitive Intelligence - 5th China Conference, CCKS 2020, Revised Selected Papers
Editors: Huajun Chen, Kang Liu, Yizhou Sun, Suge Wang, Lei Hou
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 312-321
Number of pages: 10
ISBN (Print): 9789811619632
DOI
Publication status: Published - 2021
Event: 5th China Conference on Knowledge Graph, and Semantic Computing, CCKS 2020 - Nanchang, China
Duration: 12 Nov 2020 to 15 Nov 2020

Publication series

Name: Communications in Computer and Information Science
Volume: 1356 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 5th China Conference on Knowledge Graph, and Semantic Computing, CCKS 2020
Country/Territory: China
City: Nanchang
Period: 12/11/20 to 15/11/20
