TY - JOUR
T1 - DistR
T2 - A Distributed Method for the Reachability Query over Large Uncertain Graphs
AU - Cheng, Yurong
AU - Yuan, Ye
AU - Chen, Lei
AU - Wang, Guoren
AU - Giraud-Carrier, Christophe
AU - Sun, Yongjiao
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/11/1
Y1 - 2016/11/1
N2 - Among uncertain graph queries, reachability, i.e., the probability that one vertex is reachable from another, is likely the most fundamental one. Although this problem has been studied within the field of network reliability, solutions are implemented on a single computer and can only handle small graphs. However, as the size of graph applications continually increases, the corresponding graph data can no longer fit within a single computer's memory and must therefore be distributed across several machines. Furthermore, the computation of probabilistic reachability queries is #P-complete making it very expensive even on small graphs. In this paper, we develop an efficient distributed strategy, called DistR, to solve the problem of reachability query over large uncertain graphs. Specifically, we perform the task in two steps: distributed graph reduction and distributed consolidation. In the distributed graph reduction step, we find all of the maximal subgraphs of the original graph, whose reachability probabilities can be calculated in polynomial time, compute them and reduce the graph accordingly. After this step, only a small graph remains. In the distributed consolidation step, we transform the problem into a relational join process and provide an approximate answer to the #P-complete reachability query. Extensive experimental studies show that our distributed approach is efficient in terms of both computational and communication costs, and has high accuracy.
AB - Among uncertain graph queries, reachability, i.e., the probability that one vertex is reachable from another, is likely the most fundamental one. Although this problem has been studied within the field of network reliability, solutions are implemented on a single computer and can only handle small graphs. However, as the size of graph applications continually increases, the corresponding graph data can no longer fit within a single computer's memory and must therefore be distributed across several machines. Furthermore, the computation of probabilistic reachability queries is #P-complete making it very expensive even on small graphs. In this paper, we develop an efficient distributed strategy, called DistR, to solve the problem of reachability query over large uncertain graphs. Specifically, we perform the task in two steps: distributed graph reduction and distributed consolidation. In the distributed graph reduction step, we find all of the maximal subgraphs of the original graph, whose reachability probabilities can be calculated in polynomial time, compute them and reduce the graph accordingly. After this step, only a small graph remains. In the distributed consolidation step, we transform the problem into a relational join process and provide an approximate answer to the #P-complete reachability query. Extensive experimental studies show that our distributed approach is efficient in terms of both computational and communication costs, and has high accuracy.
KW - Reachability
KW - distributed computing
KW - uncertain graphs
UR - http://www.scopus.com/inward/record.url?scp=84994410001&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2016.2535444
DO - 10.1109/TPDS.2016.2535444
M3 - Article
AN - SCOPUS:84994410001
SN - 1045-9219
VL - 27
SP - 3172
EP - 3185
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 11
M1 - 7420710
ER -