Efficient k-Clique Counting on Large Graphs: The Power of Color-Based Sampling Approaches

Xiaowei Ye, Rong Hua Li*, Qiangqiang Dai, Hongzhi Chen, Guoren Wang

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

2 Citations (Scopus)

Abstract

KK-clique counting is a fundamental problem in network analysis which has attracted much attention in recent years. Computing the count of kk-cliques in a graph for a large kk (e.g., k=8k=8) is often intractable as the number of kk-cliques increases exponentially w.r.t. (with respect to) kk. Existing exact kk-clique counting algorithms are often hard to handle large dense graphs, while sampling-based solutions either require a huge number of samples or consume very high storage space to achieve a satisfactory accuracy. To overcome these limitations, we propose a new framework to estimate the number of kk-cliques which integrates both the exact kk-clique counting technique and three novel color-based sampling techniques. The key insight of our framework is that we only apply the exact algorithm to compute the kk-clique counts in the sparse regions of a graph, and use the proposed color-based sampling approaches to estimate the number of kk-cliques in the dense regions of the graph. Specifically, we develop three novel dynamic programming based kk-color set sampling techniques to efficiently estimate the kk-clique counts, where a kk-color set contains kk nodes with kk different colors. Since a kk-color set is often a good approximation of a kk-clique in the dense regions of a graph, our sampling-based solutions are extremely efficient and accurate. Moreover, the proposed sampling techniques are space efficient which use near-linear space w.r.t. graph size. We conduct extensive experiments to evaluate our algorithms using 8 real-life graphs. The results show that our best algorithm is at least one order of magnitude faster than the state-of-the-art sampling-based solutions (with the same relative error 0.1%) and can be up to three orders of magnitude faster than the state-of-the-art exact algorithm on large graphs.

Original languageEnglish
Pages (from-to)1518-1536
Number of pages19
JournalIEEE Transactions on Knowledge and Data Engineering
Volume36
Issue number4
DOIs
Publication statusPublished - 1 Apr 2024

Keywords

  • cohesive subgraphs
  • dynamic programming
  • graph coloring
  • graph sampling
  • k-clique counting

Fingerprint

Dive into the research topics of 'Efficient k-Clique Counting on Large Graphs: The Power of Color-Based Sampling Approaches'. Together they form a unique fingerprint.

Cite this