Generating image description by modeling spatial context of an image

Kan Li, Lin Bai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

Generating descriptive sentences for a real image is a challenging task in image understanding. The difficulty lies mainly in recognizing the interaction activities between objects and in predicting the relationship between objects and stuff/scene. In this paper, we propose a framework that improves image description generation by addressing these problems. The framework comprises two models: a unified spatial context model and an image description generation model. The former, the centerpiece of the framework, models 3D spatial context to learn human-object interaction activities and to predict the semantic relationship between these activities and stuff/scene. The spatial context model casts these problems as latent structured labeling problems, which can be solved by a unified mathematical optimization. Based on the resulting semantic relationship, the image description generation model then generates descriptive sentences through the proposed lexicalized tree-based algorithm. Experiments on a joint dataset show that the framework outperforms state-of-the-art methods in spatial co-occurrence context analysis, human-object interaction recognition, and image description generation.

Original language: English
Title of host publication: 2015 International Joint Conference on Neural Networks, IJCNN 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781479919604
Publication status: Published - 28 Sept 2015
Event: International Joint Conference on Neural Networks, IJCNN 2015 - Killarney, Ireland
Duration: 12 Jul 2015 to 17 Jul 2015

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
Volume: 2015-September

Conference

Conference: International Joint Conference on Neural Networks, IJCNN 2015
Country/Territory: Ireland
City: Killarney
Period: 12/07/15 to 17/07/15

Keywords

  • Image recognition
  • Layout
  • Semantics
