Locally densest subgraph discovery

Lu Qin; Rong Hua Li; Lijun Chang; Chengqi Zhang

doi:10.1145/2783258.2783299

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

64 Citations (Scopus)

Abstract

Mining dense subgraphs from a large graph is a fundamental graph mining task with applications in domains such as network science, biology, graph databases, web mining, graph compression, and micro-blogging systems. Here a dense subgraph is a subgraph with high density, defined as the number of edges divided by the number of nodes. Existing studies either focus on finding the densest subgraph or on identifying an optimal clique-like dense subgraph, and they adopt a simple greedy approach to find the top-k dense subgraphs. However, the subgraphs they identify cannot be used to represent the dense regions of the graph. Intuitively, to represent a dense region, the identified subgraph should have the highest density in its local region of the graph. However, it is non-trivial to formally model a locally densest subgraph. In this paper, we aim to discover the top-k such representative locally densest subgraphs of a graph. We provide an elegant parameter-free definition of a locally densest subgraph. The definition not only fits the intuition well, but is also associated with several nice structural properties. We show that the set of locally densest subgraphs in a graph can be computed in polynomial time. We further propose three novel pruning strategies to substantially reduce the search space of the algorithm. In our experiments, we use several real datasets with various graph properties to evaluate the effectiveness of our model using four quality measures and a case study. We also test our algorithms on several real web-scale graphs, one of which contains 118.14 million nodes and 1.02 billion edges, to demonstrate the high efficiency of the proposed algorithms.
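The density measure used in the abstract (edges divided by nodes) can be illustrated with a small Python sketch. It is not the paper's locally densest subgraph algorithm. It only computes the density of a node set and runs a standard greedy peeling heuristic, which is known to give a 1/2-approximation of the single densest subgraph. The function names `density` and `greedy_peel` are hypothetical, and the sketch assumes a simple undirected graph given as an edge list with each edge listed once.

```python
from collections import defaultdict


def density(nodes, edges):
    """Number of edges with both endpoints in `nodes`, divided by |nodes|."""
    nodes = set(nodes)
    if not nodes:
        return 0.0
    inside = sum(1 for u, v in edges if u in nodes and v in nodes)
    return inside / len(nodes)


def greedy_peel(edges):
    """Greedy peeling heuristic (an illustration, not the paper's method).

    Repeatedly deletes a minimum-degree node and returns the densest
    intermediate node set together with its density.
    """
    # Build an adjacency map, ignoring self-loops.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    remaining = set(adj)
    if not remaining:
        return set(), 0.0

    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # current edge count
    best_nodes, best_density = set(remaining), m / len(remaining)

    while len(remaining) > 1:
        # Pick a minimum-degree node; the string tie-break keeps runs deterministic.
        u = min(remaining, key=lambda x: (len(adj[x]), str(x)))
        m -= len(adj[u])
        for w in adj[u]:
            adj[w].discard(u)
        del adj[u]
        remaining.remove(u)

        d = m / len(remaining)
        if d > best_density:
            best_nodes, best_density = set(remaining), d

    return best_nodes, best_density
```

For example, on a 4-clique with a two-edge tail attached, peeling removes the tail first and reports the clique as the densest intermediate subgraph. A linear scan for the minimum-degree node keeps the sketch short; a bucket queue would be the usual choice for large graphs.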

Original language	English
Title of host publication	KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher	Association for Computing Machinery
Pages	965-974
Number of pages	10
ISBN (Electronic)	9781450336642
DOIs	https://doi.org/10.1145/2783258.2783299
Publication status	Published - 10 Aug 2015
Externally published	Yes
Event	21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2015 - Sydney, Australia (Duration: 10 Aug 2015 → 13 Aug 2015)

Publication series

Name	Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume	2015-August

Conference

Conference	21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2015
Country/Territory	Australia
City	Sydney
Period	10/08/15 → 13/08/15

Keywords

Big data
Dense subgraph
Graph

Access to Document

10.1145/2783258.2783299

Cite this

Qin, L., Li, R. H., Chang, L., & Zhang, C. (2015). Locally densest subgraph discovery. In KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 965-974). (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; Vol. 2015-August). Association for Computing Machinery. https://doi.org/10.1145/2783258.2783299

@inproceedings{4129f9e4bc684cba9d8a6a4ce89187df,

title = "Locally densest subgraph discovery",

keywords = "Big data, Dense subgraph, Graph",

author = "Qin, Lu and Li, {Rong Hua} and Chang, Lijun and Zhang, Chengqi",

note = "Publisher Copyright: {\textcopyright} 2015 ACM.; 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2015 ; Conference date: 10-08-2015 Through 13-08-2015",

year = "2015",

month = aug,

day = "10",

doi = "10.1145/2783258.2783299",

language = "English",

series = "Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",

publisher = "Association for Computing Machinery",

pages = "965--974",

booktitle = "KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining",

}

Qin, L, Li, RH, Chang, L & Zhang, C 2015, Locally densest subgraph discovery. in KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. 2015-August, Association for Computing Machinery, pp. 965-974, 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2015, Sydney, Australia, 10/08/15. https://doi.org/10.1145/2783258.2783299

Locally densest subgraph discovery. / Qin, Lu; Li, Rong Hua; Chang, Lijun et al.
KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, 2015. p. 965-974 (Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; Vol. 2015-August).

TY - GEN

T1 - Locally densest subgraph discovery

AU - Qin, Lu

AU - Li, Rong Hua

AU - Chang, Lijun

AU - Zhang, Chengqi

PY - 2015/8/10

Y1 - 2015/8/10

AB - Mining dense subgraphs from a large graph is a fundamental graph mining task and can be widely applied in a variety of application domains such as network science, biology, graph database, web mining, graph compression, and micro-blogging systems. Here a dense subgraph is defined as a subgraph with high density (#.edge/#.node). Existing studies of this problem either focus on finding the densest subgraph or identifying an optimal clique-like dense subgraph, and they adopt a simple greedy approach to find the top-k dense subgraphs. However, their identified subgraphs cannot be used to represent the dense regions of the graph. Intuitively, to represent a dense region, the subgraph identified should be the subgraph with highest density in its local region in the graph. However, it is non-trivial to formally model a locally densest subgraph. In this paper, we aim to discover top-k such representative locally densest subgraphs of a graph. We provide an elegant parameter-free definition of a locally densest subgraph. The definition not only fits well with the intuition, but is also associated with several nice structural properties. We show that the set of locally densest subgraphs in a graph can be computed in polynomial time. We further propose three novel pruning strategies to largely reduce the search space of the algorithm. In our experiments, we use several real datasets with various graph properties to evaluate the effectiveness of our model using four quality measures and a case study. We also test our algorithms on several real web-scale graphs, one of which contains 118.14 million nodes and 1.02 billion edges, to demonstrate the high efficiency of the proposed algorithms.

KW - Big data

KW - Dense subgraph

KW - Graph

UR - http://www.scopus.com/inward/record.url?scp=84954128641&partnerID=8YFLogxK

U2 - 10.1145/2783258.2783299

DO - 10.1145/2783258.2783299

M3 - Conference contribution

AN - SCOPUS:84954128641

T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

SP - 965

EP - 974

BT - KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining

PB - Association for Computing Machinery

T2 - 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2015

Y2 - 10 August 2015 through 13 August 2015

ER -
