PSMC: Provable and Scalable Algorithms for Motif Conductance Based Graph Clustering

Longlong Lin, Tao Jia, Zeli Wang, Jin Zhao, Rong Hua Li

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Higher-order graph clustering aims to partition the graph using frequently occurring subgraphs (i.e., motifs), instead of the lower-order edges, as the atomic clustering unit, which has been recognized as the state-of-the-art solution in ground truth community detection and knowledge discovery. Motif conductance is one of the most promising higher-order graph clustering models due to its strong interpretability. However, existing motif conductance based graph clustering algorithms are mainly limited by a seminal two-stage reweighting computing framework, needing to enumerate all motif instances to obtain an edge-weighted graph for partitioning. However, such a framework has two-fold vital defects: (1) It can only provide a quadratic bound for the motif with three vertices, and whether there is provable clustering quality for other motifs is still an open question. (2) The enumeration procedure of motif instances incurs prohibitively high costs against large motifs or large dense graphs due to combinatorial explosions. Besides, expensive spectral clustering or local graph diffusion on the edge-weighted graph also makes existing methods unable to handle massive graphs with millions of nodes. To overcome these dilemmas, we propose a Provable and Scalable Motif Conductance algorithm PSMC, which has a fixed and motif-independent approximation ratio for any motif. Specifically, PSMC first defines a new vertex metric Motif Resident based on the given motif, which can be computed locally. Then, it iteratively deletes the vertex with the smallest motif resident value very efficiently using novel dynamic update technologies. Finally, it outputs the locally optimal result during the above iterative process. To further boost efficiency, we propose several effective bounds to estimate the motif resident value of each vertex, which can greatly reduce computational costs. Empirical results on real-life and synthetic demonstrate that our proposed algorithms achieve 3.2-32 times speedup and improve the quality by at least 12 times than the state-of-the art baselines.

Original languageEnglish
Title of host publicationKDD 2024 - Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PublisherAssociation for Computing Machinery
Pages1793-1803
Number of pages11
ISBN (Electronic)9798400704901
DOIs
Publication statusPublished - 24 Aug 2024
Event30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024 - Barcelona, Spain
Duration: 25 Aug 202429 Aug 2024

Publication series

NameProceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
ISSN (Print)2154-817X

Conference

Conference30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024
Country/TerritorySpain
CityBarcelona
Period25/08/2429/08/24

Keywords

  • higher-order graph clustering
  • motif conductance

Fingerprint

Dive into the research topics of 'PSMC: Provable and Scalable Algorithms for Motif Conductance Based Graph Clustering'. Together they form a unique fingerprint.

Cite this