I/O cost minimization: Reachability queries processing over massive graphs

Zhiwei Zhang*, Jeffrey Xu Yu, Lu Qin, Qing Zhu, Xiaofang Zhou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

15 Citations (Scopus)

Abstract

Given a directed graph G, a reachability query (u, v) asks whether there exists a path from a node u to a node v in G. The existing studies support reachability queries using indexing techniques, where both the graph and the index are required to reside in main memory. However, they cannot handle reachability queries on massive graphs, when the graph and the index cannot be entirely held in memory because of the high I/O cost. In this paper, we focus on how to minimize the I/O cost when answering reachability queries on massive graphs that cannot reside entirely in memory. First, we propose a new Yes-Label scheme, as a complement of the No-Label used in GRAIL [23], to reduce the number of intermediate results generated. Second, we show how to minimize the number of I/Os using a heap-on-disk data structure when traversing a graph. We also propose new methods to partition the heap-on-disk, in order to ensure that only sequential I/Os are performed. Third, we analyze our approaches and show how to extend our approaches to answer multiple reachability queries effectively. Finally, we conducted extensive performance studies on both large synthetic and large real graphs, and confirm the efficiency of our approaches.

Original languageEnglish
Title of host publicationAdvances in Database Technology - EDBT 2012
Subtitle of host publication15th International Conference on Extending Database Technology, Proceedings
Pages468-479
Number of pages12
DOIs
Publication statusPublished - 2012
Externally publishedYes
Event15th International Conference on Extending Database Technology, EDBT 2012 - Berlin, Germany
Duration: 27 Mar 201230 Mar 2012

Publication series

NameACM International Conference Proceeding Series

Conference

Conference15th International Conference on Extending Database Technology, EDBT 2012
Country/TerritoryGermany
CityBerlin
Period27/03/1230/03/12

Fingerprint

Dive into the research topics of 'I/O cost minimization: Reachability queries processing over massive graphs'. Together they form a unique fingerprint.

Cite this