TY - JOUR
T1 - Provable Higher-order Graph Clustering
T2 - the Power of Peeling-based Approaches
AU - Lin, Longlong
AU - Wang, Zeli
AU - Li, Rong Hua
AU - Liu, Qiyu
AU - Qin, Hongchao
AU - Zhao, Jin
N1 - Publisher Copyright:
© 1989-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Higher-order graph clustering partitions graphs use frequently occurring subgraphs instead of edges, proving effective in community detection and knowledge discovery. Motif conductance, known for its strong interpretability, is a leading model. However, existing motif conductance algorithms are hindered by a two-stage reweighting framework that requires enumerating motif instances to generate an edge-weighted graph for partitioning. This framework has two major drawbacks: (1) It provides only a quadratic bound for three-vertex motifs, with no provable approximation guarantees for other motifs. (2) Enumerating motif instances is computationally prohibitive for large motifs or dense graphs due to combinatorial explosions. Besides, costly spectral clustering or local graph diffusion on the edge-weighted graph limits their scalability. In this paper, we propose a novel peeling-based clustering framework, PSMC, offering a motifindependent approximation ratio for any motif. Specifically, PSMC first defines a new locally computable vertex metric Motif Resident based on the given motif. Then, it iteratively deletes vertices with the smallest motif resident using efficient dynamic update techniques, outputting a locally optimal result with approximation guarantees. Besides, we introduce several powerful optimization techniques to further reduce computational costs. Empirical results on real-world and synthetic datasets showcase our proposed solutions' superiority over ten competitors.
AB - Higher-order graph clustering partitions graphs use frequently occurring subgraphs instead of edges, proving effective in community detection and knowledge discovery. Motif conductance, known for its strong interpretability, is a leading model. However, existing motif conductance algorithms are hindered by a two-stage reweighting framework that requires enumerating motif instances to generate an edge-weighted graph for partitioning. This framework has two major drawbacks: (1) It provides only a quadratic bound for three-vertex motifs, with no provable approximation guarantees for other motifs. (2) Enumerating motif instances is computationally prohibitive for large motifs or dense graphs due to combinatorial explosions. Besides, costly spectral clustering or local graph diffusion on the edge-weighted graph limits their scalability. In this paper, we propose a novel peeling-based clustering framework, PSMC, offering a motifindependent approximation ratio for any motif. Specifically, PSMC first defines a new locally computable vertex metric Motif Resident based on the given motif. Then, it iteratively deletes vertices with the smallest motif resident using efficient dynamic update techniques, outputting a locally optimal result with approximation guarantees. Besides, we introduce several powerful optimization techniques to further reduce computational costs. Empirical results on real-world and synthetic datasets showcase our proposed solutions' superiority over ten competitors.
KW - cohesive subgraphs
KW - Graph clustering
KW - motif
UR - http://www.scopus.com/inward/record.url?scp=105008195312&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2025.3579811
DO - 10.1109/TKDE.2025.3579811
M3 - Article
AN - SCOPUS:105008195312
SN - 1041-4347
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
ER -