TY - JOUR
T1 - Edge-based Local Push for Personalized PageRank
AU - Wang, Hanzhi
AU - Wei, Zhewei
AU - Gan, Junhao
AU - Yuan, Ye
AU - Du, Xiaoyong
AU - Wen, Ji Rong
N1 - Publisher Copyright:
© 2022, American Mathematical Society. All rights reserved.
PY - 2022
Y1 - 2022
N2 - Personalized PageRank (PPR) is a popular node proximity metric in graph mining and network research. A single-source PPR (SSPPR) query asks for the PPR value of each node on the graph. Due to its importance and wide applications, decades of effort have been devoted to the efficient processing of SSPPR queries. Among existing algorithms, LocalPush is a fundamental method for SSPPR queries and serves as a cornerstone for subsequent algorithms. In LocalPush, the push operation is a crucial primitive, which distributes the probability at a node u to all of u’s neighbors via the corresponding edges. Although this push operation works well on unweighted graphs, it can be rather inefficient on weighted graphs. In particular, on unbalanced weighted graphs, where only a few of a node’s edges carry the majority of the total weight, the push operation has to distribute “insignificant” probabilities along the edges that carry only minor weights, resulting in expensive overhead. To resolve this issue, in this paper, we propose the EdgePush algorithm, a novel method for computing SSPPR queries on weighted graphs. EdgePush decomposes the aforementioned push operation into edge-based pushes, allowing the algorithm to operate at edge-level granularity. As a result, it can flexibly distribute the probabilities according to edge weights. Furthermore, EdgePush allows a fine-grained termination threshold for each individual edge, leading to a superior complexity over LocalPush. Notably, we prove that EdgePush improves the theoretical query cost of LocalPush by a factor of up to O(n) when the graph’s weights are unbalanced. Our experimental results demonstrate that EdgePush significantly outperforms state-of-the-art baselines in terms of query efficiency on large motif-based and real-world weighted graphs.
AB - Personalized PageRank (PPR) is a popular node proximity metric in graph mining and network research. A single-source PPR (SSPPR) query asks for the PPR value of each node on the graph. Due to its importance and wide applications, decades of effort have been devoted to the efficient processing of SSPPR queries. Among existing algorithms, LocalPush is a fundamental method for SSPPR queries and serves as a cornerstone for subsequent algorithms. In LocalPush, the push operation is a crucial primitive, which distributes the probability at a node u to all of u’s neighbors via the corresponding edges. Although this push operation works well on unweighted graphs, it can be rather inefficient on weighted graphs. In particular, on unbalanced weighted graphs, where only a few of a node’s edges carry the majority of the total weight, the push operation has to distribute “insignificant” probabilities along the edges that carry only minor weights, resulting in expensive overhead. To resolve this issue, in this paper, we propose the EdgePush algorithm, a novel method for computing SSPPR queries on weighted graphs. EdgePush decomposes the aforementioned push operation into edge-based pushes, allowing the algorithm to operate at edge-level granularity. As a result, it can flexibly distribute the probabilities according to edge weights. Furthermore, EdgePush allows a fine-grained termination threshold for each individual edge, leading to a superior complexity over LocalPush. Notably, we prove that EdgePush improves the theoretical query cost of LocalPush by a factor of up to O(n) when the graph’s weights are unbalanced. Our experimental results demonstrate that EdgePush significantly outperforms state-of-the-art baselines in terms of query efficiency on large motif-based and real-world weighted graphs.
UR - http://www.scopus.com/inward/record.url?scp=85134012203&partnerID=8YFLogxK
U2 - 10.14778/3523210.3523216
DO - 10.14778/3523210.3523216
M3 - Conference article
AN - SCOPUS:85134012203
SN - 2150-8097
VL - 15
SP - 1376
EP - 1389
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 7
T2 - 48th International Conference on Very Large Data Bases, VLDB 2022
Y2 - 5 September 2022 through 9 September 2022
ER -