Video Captioning with Semantic Information from the Knowledge Base

Dan Wang, Dandan Song

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Citations (Scopus)

Abstract

Generating video descriptions is a very challenging task due to the complex spatiotemporal information involved. Recently, many methods have been proposed that utilize LSTMs to generate sentences for videos. Inspired by recent work in machine translation and object detection, we propose a new approach to video captioning that incorporates Knowledge Base information with the frame features of the video. We compare and analyze our approach against prior work and show that large volumes of Knowledge Base information can be exploited to generate video descriptions. We implement our ideas on the S2VT model and demonstrate that our method outperforms the state of the art on video captioning benchmarks.
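To illustrate the general idea the abstract describes, here is a minimal sketch of fusing per-frame visual features with knowledge-base embeddings of detected objects before feeding the result to a caption decoder such as S2VT's LSTM. All names, dimensions, and the mean-pooling fusion are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch: combining frame features with knowledge-base (KB)
# semantic embeddings of objects detected in the frame.
# The KB lookup, embedding size, and fusion-by-concatenation are assumptions
# for illustration only.

def kb_embedding(entity, kb, dim=4):
    """Look up a semantic embedding for a detected entity; zeros if unknown."""
    return kb.get(entity, [0.0] * dim)

def fuse_features(frame_feature, detected_entities, kb, dim=4):
    """Concatenate a frame's visual feature with the mean KB embedding
    of the entities an object detector found in that frame."""
    if detected_entities:
        embs = [kb_embedding(e, kb, dim) for e in detected_entities]
        mean_emb = [sum(vals) / len(embs) for vals in zip(*embs)]
    else:
        mean_emb = [0.0] * dim
    # The fused vector would be the per-timestep input to the caption decoder.
    return frame_feature + mean_emb

# Toy knowledge base mapping entities to semantic vectors.
kb = {"dog": [1.0, 0.0, 0.0, 0.0], "ball": [0.0, 1.0, 0.0, 0.0]}
fused = fuse_features([0.2, 0.5], ["dog", "ball"], kb)
print(fused)  # 2-d frame feature followed by the 4-d mean KB embedding
```

The fusion step here is deliberately simple (mean pooling plus concatenation); the paper's contribution lies in which KB information is injected and how it interacts with the S2VT encoder-decoder.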

Original language: English
Title of host publication: Proceedings - 2017 IEEE International Conference on Big Knowledge, ICBK 2017
Editors: Xindong Wu, Tamer Ozsu, Jim Hendler, Ruqian Lu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 224-229
Number of pages: 6
ISBN (Electronic): 9781538631195
DOIs
Publication status: Published - 30 Aug 2017
Externally published: Yes
Event: 8th IEEE International Conference on Big Knowledge, ICBK 2017 - Hefei, China
Duration: 9 Aug 2017 – 10 Aug 2017

Publication series

Name: Proceedings - 2017 IEEE International Conference on Big Knowledge, ICBK 2017

Conference

Conference: 8th IEEE International Conference on Big Knowledge, ICBK 2017
Country/Territory: China
City: Hefei
Period: 9/08/17 – 10/08/17

Keywords

  • Video description
  • knowledge base
  • object detection
