IMAGE CAPTIONING WITH INHERENT SENTIMENT

Tong Li, Yunhui Hu, Xinxiao Wu*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Citations (Scopus)

Abstract

We propose a new task called sentimental image captioning, which aims to generate captions that carry the inherent sentiment reflected by the image. Whereas the stylized image captioning task requires a predefined style that is independent of the image, our new task automatically analyzes the inherent sentiment tendency within the image. With this in mind, we propose an Inherent Sentiment Image Captioning (InSenti-Cap) method that first extracts the content and sentiment information from the image, and then fuses this information into the sentimental sentence generation via an attention mechanism. To effectively train the proposed model using the image and factual caption pairs in existing captioning datasets together with an extra sentiment corpus, we propose a two-stage training strategy that involves a sentimental regularization and a sentimental reward, enabling the model to generate fluent and relevant sentences with inherent sentimental styles. Experiments demonstrate the effectiveness of our method.
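
The fusion step described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the abstract only says that content and sentiment information are fused "via an attention mechanism", so the scaled dot-product form, the function names `softmax` and `attention_fuse`, and the toy feature vectors below are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(query, content_feats, sentiment_feats):
    """Fuse content and sentiment features into one context vector.

    Hypothetical sketch: the decoder state `query` attends over the
    concatenated list of content and sentiment feature vectors using
    scaled dot-product attention (an assumed choice, not from the paper).
    """
    feats = content_feats + sentiment_feats  # list concatenation
    scale = math.sqrt(len(query))
    # One attention score per feature vector.
    scores = [sum(q * f for q, f in zip(query, feat)) / scale for feat in feats]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of all feature vectors.
    dim = len(query)
    context = [sum(w * feat[i] for w, feat in zip(weights, feats)) for i in range(dim)]
    return context, weights


# Toy inputs (made up): two content vectors and one sentiment vector.
query = [1.0, 0.0]
content = [[1.0, 0.0], [0.0, 1.0]]
sentiment = [[2.0, 0.0]]
context, weights = attention_fuse(query, content, sentiment)
```

In this toy setting the sentiment vector is the one most aligned with the query, so it receives the largest attention weight. In a real captioning model the context vector would be recomputed at every decoding step and passed to the word generator.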

Original language: English
Title of host publication: 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
Publisher: IEEE Computer Society
ISBN (Electronic): 9781665438643
Publication status: Published - 2021
Event: 2021 IEEE International Conference on Multimedia and Expo, ICME 2021 - Shenzhen, China
Duration: 5 Jul 2021 to 9 Jul 2021

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
Country/Territory: China
City: Shenzhen
Period: 5/07/21 to 9/07/21

Keywords

  • Image Captioning
  • Image Sentiment Analysis
  • Sentimental Image Captioning
