A Fast Parallel Community Discovery Model on Complex Networks Through Approximate Optimization

Shaojie Qiao, Nan Han, Yunjun Gao*, Rong Hua Li, Jianbin Huang, Jun Guo, Louis Alberto Gutierrez, Xindong Wu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

54 Citations (Scopus)

Abstract

Community discovery plays an essential role in analyzing the structural features of complex networks. As online networks grow increasingly large and complex, traditional community discovery methods cannot efficiently handle large-scale network data. This raises the important problem of how to discover large communities in complex networks both effectively and efficiently. In this study, we propose a fast parallel community discovery model called picaso (a parallel community discovery algorithm based on approximate optimization), which integrates two new techniques: (1) the Mountain model, which uses graph theory to approximate the selection of nodes to be merged, and (2) the Landslide algorithm, which updates the modularity increment based on the approximated optimization. In addition, the GraphX distributed computing framework is employed to achieve parallel community detection over complex networks. In the proposed model, modularity-based clustering is used to initialize the Mountain model and to compute the weight of each edge in the network. The relationships among the communities are then simplified by applying the Landslide algorithm, which yields the community structures of the complex networks. Extensive experiments were conducted on real and synthetic complex network datasets, and the results demonstrate that the proposed algorithm can outperform state-of-the-art methods in both effectiveness and efficiency on the community detection problem. Moreover, we demonstrate that the model is approximately four times faster than similar approaches in overall time performance. Our results thus suggest a new paradigm for large-scale community discovery in complex networks.
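
The abstract refers to clustering on modularity and to updating a modularity increment, but gives no formulas. As a rough illustration only, the sketch below uses the standard Newman-Girvan modularity for an undirected, unweighted graph and the usual greedy merge gain; it is not the Mountain model or the Landslide algorithm, and it does not use GraphX. The function names `modularity` and `merge_gain` and the example graph `TWO_TRIANGLES` are hypothetical, introduced here for illustration.

```python
from collections import defaultdict

def modularity(edges, community):
    """Standard modularity Q = sum_c (L_c / m - (d_c / 2m)^2).

    edges: list of (u, v) pairs of an undirected graph.
    community: dict mapping each node to a community label.
    L_c is the number of edges inside c, d_c the total degree of c.
    """
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the same community
    degree = defaultdict(int)  # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

def merge_gain(edges, community, a, b):
    """Modularity increment from merging communities a and b (a != b).

    Uses the closed form  dQ = L_ab / m - d_a * d_b / (2 m^2),
    where L_ab is the number of edges between a and b.
    """
    m = len(edges)
    between = deg_a = deg_b = 0
    for u, v in edges:
        cu, cv = community[u], community[v]
        for c in (cu, cv):
            if c == a:
                deg_a += 1
            elif c == b:
                deg_b += 1
        if {cu, cv} == {a, b}:
            between += 1
    return between / m - deg_a * deg_b / (2 * m * m)

# Hypothetical toy graph: two triangles joined by one bridge edge (2, 3).
TWO_TRIANGLES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

if __name__ == "__main__":
    split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
    print("Q(split)      =", modularity(TWO_TRIANGLES, split))
    print("gain(A merge B) =", merge_gain(TWO_TRIANGLES, split, "A", "B"))
```

In this toy graph, merging the two triangles lowers modularity, so a greedy procedure would stop at the two-community split. The closed-form gain avoids recomputing Q from scratch for every candidate merge, which is the general motivation for maintaining modularity increments incrementally.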

Original language: English
Article number: 8283822
Pages (from-to): 1638-1651
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 30
Issue number: 9
Publication status: Published - 1 Sept 2018

Keywords

  • Community discovery
  • approximate optimization
  • complex networks
  • distributed computing
  • graph theory
