ExactSim: benchmarking single-source SimRank algorithms with high-precision ground truths

Hanzhi Wang, Zhewei Wei*, Yu Liu, Ye Yuan, Xiaoyong Du, Ji Rong Wen

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

6 Citations (Scopus)

Abstract

SimRank is a popular measurement for evaluating the node-to-node similarities based on the graph topology. In recent years, single-source and top-k SimRank queries have received increasing attention due to their applications in web mining, social network analysis, and spam detection. However, a fundamental obstacle in studying SimRank has been the lack of ground truths. The only exact algorithm, Power Method, is computationally infeasible on graphs with more than 10 6 nodes. Consequently, no existing work has evaluated the actual accuracy of various single-source and top-k SimRank algorithms on large real-world graphs. In this paper, we present ExactSim, the first algorithm that computes the exact single-source and top-k SimRank results on large graphs. This algorithm produces ground truths with precision up to 7 decimal places with high probability. With the ground truths computed by ExactSim, we present the first experimental study of the accuracy/cost trade-offs of existing approximate SimRank algorithms on large real-world graphs and synthetic graphs. Finally, we use the ground truths to exploit various properties of SimRank distributions on large graphs.

Original languageEnglish
Pages (from-to)989-1015
Number of pages27
JournalVLDB Journal
Volume30
Issue number6
DOIs
Publication statusPublished - Nov 2021

Keywords

  • Benchmarks
  • Exact computation
  • Ground truths
  • Power-law
  • SimRank
  • Single-source

Fingerprint

Dive into the research topics of 'ExactSim: benchmarking single-source SimRank algorithms with high-precision ground truths'. Together they form a unique fingerprint.

Cite this