TY - GEN
T1 - Scaling Up Maximal k-plex Enumeration
AU - Dai, Qiangqiang
AU - Li, Rong Hua
AU - Qin, Hongchao
AU - Liao, Meihao
AU - Wang, Guoren
N1 - Publisher Copyright: © 2022 ACM.
PY - 2022/10/17
Y1 - 2022/10/17
N2 - Finding all maximal k-plexes on networks is a fundamental research problem in graph analysis due to many important applications, such as community detection, biological graph analysis, and so on. A k-plex is a subgraph in which every vertex is adjacent to all but at most k vertices within the subgraph. In this paper, we study the problem of enumerating all large maximal k-plexes of a graph and develop several new and efficient techniques to solve the problem. Specifically, we first propose several novel upper-bounding techniques to prune unnecessary computations during the enumeration procedure. We show that the proposed upper bounds can be computed in linear time. Then, we develop a new branch-and-bound algorithm with a carefully-designed pivot re-selection strategy to enumerate all k-plexes, which outputs all k-plexes in O(n^2 γ_k^n) time theoretically, where n is the number of vertices of the graph and γ_k is strictly smaller than 2. In addition, a parallel version of the proposed algorithm is further developed to scale up to process large real-world graphs. Finally, extensive experimental results show that the proposed sequential algorithm can achieve up to 2× to 100× speedup over the state-of-the-art sequential algorithms on most benchmark graphs. The results also demonstrate the high scalability of the proposed parallel algorithm. For example, on a large real-world graph with more than 200 million edges, our parallel algorithm can finish the computation within two minutes, while the state-of-the-art parallel algorithm cannot terminate within 24 hours.
AB - Finding all maximal k-plexes on networks is a fundamental research problem in graph analysis due to many important applications, such as community detection, biological graph analysis, and so on. A k-plex is a subgraph in which every vertex is adjacent to all but at most k vertices within the subgraph. In this paper, we study the problem of enumerating all large maximal k-plexes of a graph and develop several new and efficient techniques to solve the problem. Specifically, we first propose several novel upper-bounding techniques to prune unnecessary computations during the enumeration procedure. We show that the proposed upper bounds can be computed in linear time. Then, we develop a new branch-and-bound algorithm with a carefully-designed pivot re-selection strategy to enumerate all k-plexes, which outputs all k-plexes in O(n^2 γ_k^n) time theoretically, where n is the number of vertices of the graph and γ_k is strictly smaller than 2. In addition, a parallel version of the proposed algorithm is further developed to scale up to process large real-world graphs. Finally, extensive experimental results show that the proposed sequential algorithm can achieve up to 2× to 100× speedup over the state-of-the-art sequential algorithms on most benchmark graphs. The results also demonstrate the high scalability of the proposed parallel algorithm. For example, on a large real-world graph with more than 200 million edges, our parallel algorithm can finish the computation within two minutes, while the state-of-the-art parallel algorithm cannot terminate within 24 hours.
KW - branch-and-bound enumeration
KW - cohesive subgraph mining
KW - maximal k-plex
UR - http://www.scopus.com/inward/record.url?scp=85140844856&partnerID=8YFLogxK
U2 - 10.1145/3511808.3557444
DO - 10.1145/3511808.3557444
M3 - Conference contribution
AN - SCOPUS:85140844856
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 345
EP - 354
BT - CIKM 2022 - Proceedings of the 31st ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 31st ACM International Conference on Information and Knowledge Management, CIKM 2022
Y2 - 17 October 2022 through 21 October 2022
ER -