MemCap: Memorizing style knowledge for image captioning

  • Wentian Zhao
  • Xinxiao Wu*
  • Xiaoxun Zhang
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Generating stylized captions for images is a challenging task since it requires not only describing the content of the image accurately but also expressing the desired linguistic style appropriately. In this paper, we propose MemCap, a novel stylized image captioning method that explicitly encodes the knowledge about linguistic styles with a memory mechanism. Rather than relying heavily on a language model to capture style factors, as existing methods do, our method resorts to memorizing stylized elements learned from the training corpus. In particular, we design a memory module that comprises a set of embedding vectors for encoding style-related phrases in the training corpus. To acquire the style-related phrases, we develop a sentence decomposition algorithm that splits a stylized sentence into a style-related part that reflects the linguistic style and a content-related part that contains the visual content. When generating captions, our MemCap first extracts content-relevant style knowledge from the memory module via an attention mechanism and then incorporates the extracted knowledge into a language model. Extensive experiments on two stylized image captioning datasets (SentiCap and FlickrStyle10K) demonstrate the effectiveness of our method.
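
The memory read described in the abstract (attending over a set of style-phrase embeddings with a content-dependent query) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the scoring function, so scaled dot-product attention is assumed here, and the names `read_memory`, `memory`, and `query`, as well as the toy vectors, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def read_memory(query, memory):
    """Attend over memory slots and return (weights, weighted sum).

    query:  content vector (hypothetical; stands in for image content)
    memory: list of style-phrase embedding vectors, same length as query
    Scoring uses scaled dot products, which is an assumption of this sketch.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, slot) / scale for slot in memory])
    dim = len(memory[0])
    summary = [
        sum(w * slot[i] for w, slot in zip(weights, memory))
        for i in range(dim)
    ]
    return weights, summary


# Toy data: three memory slots of dimension 2 and one query vector.
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, style_vector = read_memory(query, memory)
```

In this toy setup the first and third slots receive equal weight because they have the same dot product with the query, while the second slot, which is orthogonal to the query, receives less. In a captioning model, a vector like `style_vector` would then be passed to the language model as additional input; how MemCap does this is not specified in the abstract.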

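Original language: English
Title of host publication: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
Publisher: AAAI Press
Pages: 12984-12992
Number of pages: 9
ISBN (electronic): 9781577358350
Publication status: Published - 2020
Event: 34th AAAI Conference on Artificial Intelligence, AAAI 2020 - New York, United States
Duration: 7 Feb 2020 to 12 Feb 2020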

Publication series

Name: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence

Conference

Conference: 34th AAAI Conference on Artificial Intelligence, AAAI 2020
Country/Territory: United States
City: New York
Period: 7/02/20 to 12/02/20
