Across Images and Graphs for Question Answering

  • Zhenyu Wen
  • Jiaxu Qian
  • Bin Qian
  • Qin Yuan
  • Jianbin Qin
  • Qi Xuan*
  • Ye Yuan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Cross-source querying serves as a proxy for scene understanding and supports many web applications, such as recommendation systems, e-commerce, and e-learning. In this paper, we propose SVQA, which semantically combines knowledge from available images and graphs to answer complex questions. To this end, we design a graph-based method that unifies the various data sources into a single representation. We then develop a complex-question parsing method that uses the structure of language to transform the question into a query graph. Finally, a graph query engine executes the query graph over the unified data source while optimizing the query process. To evaluate the proposed system, we build a vanilla dataset called MVQA and show that state-of-the-art (SOTA) VQA models fail on our task. Comprehensive evaluations show that the proposed SVQA can reason about implicit relationships across multiple images and external knowledge to answer a complex query correctly. We hope that this first attempt gives researchers a fresh taste of multimodal data analysis.

Original language: English
Title of host publication: Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024
Publisher: IEEE Computer Society
Pages: 1366-1379
Number of pages: 14
ISBN (Electronic): 9798350317152
DOIs
Publication status: Published - 2024
Event: 40th IEEE International Conference on Data Engineering, ICDE 2024 - Utrecht, Netherlands
Duration: 13 May 2024 → 17 May 2024

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Print): 1084-4627
ISSN (Electronic): 2375-0286

Conference

Conference: 40th IEEE International Conference on Data Engineering, ICDE 2024
Country/Territory: Netherlands
City: Utrecht
Period: 13/05/24 → 17/05/24

Keywords

  • Data Mining and Knowledge Discovery
  • Query Processing, Indexing, and Optimization
  • Text, Semi-Structured Data, IR, Image, and Multimedia Databases
