Recurrent attention LSTM model for image Chinese caption generation

Chaoying Zhang, Yaping Dai, Yanyan Cheng, Zhiyang Jia, Kaoru Hirota

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

A Recurrent Attention LSTM (RAL) model is proposed for image Chinese caption generation. The model uses Google's Inception-v4 CNN to extract image features, while a recurrent attention LSTM mechanism determines the feature weights. By weighting image regions, the model generates words more accurately, producing more relevant descriptions and improving the efficiency of the system. Compared with the Neural Image Caption (NIC) model, experimental results on the AI Challenger Image Chinese Captioning dataset show that the proposed model improves performance by 1.8% on the BLEU-4 metric and 6.2% on the CIDEr metric.
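The abstract describes weighting CNN region features with an attention mechanism before feeding them to the LSTM decoder. The sketch below illustrates one such step with additive soft attention over region features; it is a minimal NumPy illustration under common assumptions, not the authors' implementation, and all names (`soft_attention`, `W_v`, `W_h`, `w_a`) are hypothetical.

```python
import numpy as np

def soft_attention(regions, hidden, W_v, W_h, w_a):
    # regions: (k, d) CNN features, one row per image region.
    # hidden: (n,) current LSTM hidden state.
    # Additive attention: one scalar score per region (illustrative form).
    scores = np.tanh(regions @ W_v + hidden @ W_h) @ w_a   # shape (k,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                # softmax over regions
    context = weights @ regions                             # (d,) weighted feature
    return context, weights

# Toy dimensions: 4 regions, 8-dim features, 6-dim hidden, 5-dim attention.
rng = np.random.default_rng(0)
k, d, n, a = 4, 8, 6, 5
ctx, w = soft_attention(rng.normal(size=(k, d)), rng.normal(size=n),
                        rng.normal(size=(d, a)), rng.normal(size=(n, a)),
                        rng.normal(size=a))
```

The context vector `ctx` would then be concatenated with the previous word embedding as the LSTM input at each decoding step, so each generated word can focus on different image regions.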

Original language: English
Title of host publication: Proceedings - 2018 Joint 10th International Conference on Soft Computing and Intelligent Systems and 19th International Symposium on Advanced Intelligent Systems, SCIS-ISIS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 808-813
Number of pages: 6
ISBN (Electronic): 9781538626337
DOIs
Publication status: Published - 2 Jul 2018
Event: Joint 10th International Conference on Soft Computing and Intelligent Systems and 19th International Symposium on Advanced Intelligent Systems, SCIS-ISIS 2018 - Toyama, Japan
Duration: 5 Dec 2018 - 8 Dec 2018

Publication series

Name: Proceedings - 2018 Joint 10th International Conference on Soft Computing and Intelligent Systems and 19th International Symposium on Advanced Intelligent Systems, SCIS-ISIS 2018

Conference

Conference: Joint 10th International Conference on Soft Computing and Intelligent Systems and 19th International Symposium on Advanced Intelligent Systems, SCIS-ISIS 2018
Country/Territory: Japan
City: Toyama
Period: 5/12/18 - 8/12/18

Keywords

  • Convolutional Neural Network
  • Image Chinese Caption Generation
  • Long Short-Term Memory
  • Recurrent Attention LSTM
