Divide & conquer: I/O efficient depth-first search

Zhiwei Zhang, Jeffrey Xu Yu, Lu Qin, Zechao Shang

科研成果: 书/报告/会议事项章节会议稿件同行评审

20 引用 (Scopus)

摘要

Depth-First Search (DFS), which traverses a graph in the depthfirst order, is one of the fundamental graph operations, and the result of DFS over all nodes in G is a spanning tree known as a DFS-Tree. There are many graph algorithms that need DFS such as connected component computation, topological sort, community detection, eulerian path computation, graph bipartiteness testing, planar graph testing, etc, because the in-memory DFS algorithm shows it can be done in linear time w.r.t. the size of G. However, given the fact that real-world graphs grow rapidly in the big data era, the in-memory DFS algorithm cannot be used to handle a large graph that cannot be entirely held in main memory. In this paper, we focus on I/O efficiency and study semi-external algorithms to DFS a graph G which is on disk. Here, like the existing semiexternal algorithms, we assume that a spanning tree of G can be held in main memory and the remaining edges of G are kept on disk, and compute the DFS-Tree in main memory with which DFS can be identified. We propose novel divide & conquer algorithms to DFS over a graph G on disk. In brief, we divide a graph into several subgraphs, compute the DFS-Tree for each subgraph independently, and then merge them together to compute the DFS-Tree for the whole graph. With the global DFS-Tree computed we identify DFS. We discuss the valid division, that can lead to the correct DFS, and the challenges to do so. We propose two division algorithms, named Divide-Star and Divide-TD, and a merge algorithm. We conduct extensive experimental studies using four real massive datasets and several synthetic datasets to confirm the I/O efficiency of our approach.

源语言英语
主期刊名SIGMOD 2015 - Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
出版商Association for Computing Machinery
445-458
页数14
ISBN(电子版)9781450327589
DOI
出版状态已出版 - 27 5月 2015
已对外发布
活动ACM SIGMOD International Conference on Management of Data, SIGMOD 2015 - Melbourne, 澳大利亚
期限: 31 5月 20154 6月 2015

出版系列

姓名Proceedings of the ACM SIGMOD International Conference on Management of Data
2015-May
ISSN(印刷版)0730-8078

会议

会议ACM SIGMOD International Conference on Management of Data, SIGMOD 2015
国家/地区澳大利亚
Melbourne
时期31/05/154/06/15

指纹

探究 'Divide & conquer: I/O efficient depth-first search' 的科研主题。它们共同构成独一无二的指纹。

引用此