TY - GEN
T1 - Efficient Personalized PageRank Computation
T2 - 2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022
AU - Liao, Meihao
AU - Li, Rong Hua
AU - Dai, Qiangqiang
AU - Wang, Guoren
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/6/10
Y1 - 2022/6/10
N2 - Computing the personalized PageRank vector is a fundamental problem in graph analysis. In this paper, we propose several novel algorithms to efficiently compute the personalized PageRank vector with a decay factor α based on an interesting connection between the personalized PageRank values and the weights of random spanning forests of the graph. Such a connection is derived based on a newly-developed matrix forest theorem on graphs. Based on this, we present an efficient spanning forest sampling algorithm via simulating loop-erased α-random walks to estimate the personalized PageRank vector. Compared to all existing methods, a striking feature of our approach is that its performance is insensitive w.r.t. (with respect to) the parameter α. As a consequence, our algorithm is often much faster than the state-of-the-art algorithms when α is small, which is the demanding case for many graph analysis tasks. We show that our technique can significantly improve the efficiency of the state-of-the-art algorithms for answering two well-studied personalized PageRank queries, including single source query and single target query. Extensive experiments on seven large real-world graphs demonstrate the efficiency of the proposed method.
AB - Computing the personalized PageRank vector is a fundamental problem in graph analysis. In this paper, we propose several novel algorithms to efficiently compute the personalized PageRank vector with a decay factor α based on an interesting connection between the personalized PageRank values and the weights of random spanning forests of the graph. Such a connection is derived based on a newly-developed matrix forest theorem on graphs. Based on this, we present an efficient spanning forest sampling algorithm via simulating loop-erased α-random walks to estimate the personalized PageRank vector. Compared to all existing methods, a striking feature of our approach is that its performance is insensitive w.r.t. (with respect to) the parameter α. As a consequence, our algorithm is often much faster than the state-of-the-art algorithms when α is small, which is the demanding case for many graph analysis tasks. We show that our technique can significantly improve the efficiency of the state-of-the-art algorithms for answering two well-studied personalized PageRank queries, including single source query and single target query. Extensive experiments on seven large real-world graphs demonstrate the efficiency of the proposed method.
KW - personalized pagerank
KW - spanning forest
UR - http://www.scopus.com/inward/record.url?scp=85132787052&partnerID=8YFLogxK
U2 - 10.1145/3514221.3526140
DO - 10.1145/3514221.3526140
M3 - Conference contribution
AN - SCOPUS:85132787052
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 2048
EP - 2061
BT - SIGMOD 2022 - Proceedings of the 2022 International Conference on Management of Data
PB - Association for Computing Machinery
Y2 - 12 June 2022 through 17 June 2022
ER -