Community Search: A Meta-Learning Approach

Shuheng Fang; Kangfei Zhao; Guanghua Li; Jeffrey Xu Yu

doi:10.1109/ICDE55515.2023.00182

Corresponding author: Kangfei Zhao

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

12 citations (Scopus)

Abstract

Community Search (CS) is one of the fundamental graph analysis tasks and a building block of various real applications. Given any query nodes, CS aims to find cohesive subgraphs that the query nodes belong to. Recently, a large number of CS algorithms have been designed. These algorithms adopt predefined subgraph patterns to model the communities, and so cannot find ground-truth communities that do not have such predefined patterns in real-world graphs. Therefore, machine learning (ML) and deep learning (DL) based approaches have been proposed to capture flexible community structures by learning from ground-truth communities in a data-driven fashion. These approaches rely on sufficient training data to provide enough generalization for ML models; however, the ground truth cannot be comprehensively collected beforehand. In this paper, we study ML/DL-based approaches for CS under the circumstance of small training data. Instead of directly fitting the small data, we extract prior knowledge that is shared across multiple CS tasks by learning a meta model. Each CS task is a graph with several queries that possess corresponding partial ground truth. The meta model can be swiftly adapted to a task to be predicted by feeding it a small amount of task-specific training data. We find that trivially applying multiple classical meta-learning algorithms to CS suffers from problems regarding prediction effectiveness, generalization capability and efficiency. To address such problems, we propose a novel meta-learning based framework, Conditional Graph Neural Process (CGNP), to fulfill the prior extraction and adaptation procedure. A meta CGNP model is a task-common node embedding function for clustering, learned by metric-based graph learning, which fully exploits the characteristics of CS. We compare CGNP with CS algorithms and ML baselines on real graphs with ground-truth communities. Our experiments verify that CGNP outperforms the other native graph algorithms and ML/DL baselines by 0.33 and 0.26 on F1 score on average, respectively.

Original language	English
Title of host publication	Proceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023
Publisher	IEEE Computer Society
Pages	2358-2371
Number of pages	14
ISBN (electronic)	9798350322279
DOI	https://doi.org/10.1109/ICDE55515.2023.00182
Publication status	Published - 2023
Event	39th IEEE International Conference on Data Engineering, ICDE 2023 - Anaheim, United States. Duration: 3 Apr 2023 → 7 Apr 2023

Publication series

Name	Proceedings - International Conference on Data Engineering
Volume	2023-April
ISSN (print)	1084-4627

Conference

Conference	39th IEEE International Conference on Data Engineering, ICDE 2023
Country/Territory	United States
City	Anaheim
Period	3/04/23 → 7/04/23

Cite this

Fang, S., Zhao, K., Li, G., & Xu Yu, J. (2023). Community Search: A Meta-Learning Approach. In Proceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023 (pp. 2358-2371). (Proceedings - International Conference on Data Engineering; Vol. 2023-April). IEEE Computer Society. https://doi.org/10.1109/ICDE55515.2023.00182

@inproceedings{b5d20dad4365452bb0e0dec461785047,

title = "Community Search: A Meta-Learning Approach",

abstract = "Community Search (CS) is one of the fundamental graph analysis tasks, which is a building block of various real applications. Given any query nodes, CS aims to find cohesive subgraphs that query nodes belong to. Recently, a large number of CS algorithms are designed. These algorithms adopt predefined subgraph patterns to model the communities, which cannot find ground-truth communities that do not have such pre-defined patterns in real-world graphs. Thereby, machine learning (ML) and deep learning (DL) based approaches are proposed to capture flexible community structures by learning from ground-truth communities in a data-driven fashion. These approaches rely on sufficient training data to provide enough generalization for ML models, however, the ground-truth cannot be comprehensively collected beforehand. In this paper, we study ML/DL-based approaches for CS, under the circumstance of small training data. Instead of directly fitting the small data, we extract prior knowledge which is shared across multiple CS tasks via learning a meta model. Each CS task is a graph with several queries that possess corresponding partial ground-truth. The meta model can be swiftly adapted to a task to be predicted by feeding a few task-specific training data. We find that trivially applying multiple classical meta-learning algorithms to CS suffers from problems regarding prediction effectiveness, generalization capability and efficiency. To address such problems, we propose a novel meta-learning based framework, Conditional Graph Neural Process (CGNP), to fulfill the prior extraction and adaptation procedure. A meta CGNP model is a task-common node embedding function for clustering, learned by metric-based graph learning, which fully exploits the characteristics of CS. We compare CGNP with CS algorithms and ML baselines on real graphs with ground-truth communities. Our experiments verify that CGNP outperforms the other native graph algorithms and ML/DL baselines 0.33 and 0.26 on F1 score by average.",

keywords = "Community search, Meta-learning, Neural process",

author = "Shuheng Fang and Kangfei Zhao and Guanghua Li and {Xu Yu}, Jeffrey",

note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 39th IEEE International Conference on Data Engineering, ICDE 2023 ; Conference date: 03-04-2023 Through 07-04-2023",

year = "2023",

doi = "10.1109/ICDE55515.2023.00182",

language = "English",

series = "Proceedings - International Conference on Data Engineering",

publisher = "IEEE Computer Society",

pages = "2358--2371",

booktitle = "Proceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023",

address = "United States",

}

Fang, S, Zhao, K, Li, G & Xu Yu, J 2023, Community Search: A Meta-Learning Approach. in Proceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023. Proceedings - International Conference on Data Engineering, vol. 2023-April, IEEE Computer Society, pp. 2358-2371, 39th IEEE International Conference on Data Engineering, ICDE 2023, Anaheim, United States, 3/04/23. https://doi.org/10.1109/ICDE55515.2023.00182

Community Search: A Meta-Learning Approach. / Fang, Shuheng; Zhao, Kangfei; Li, Guanghua et al.
Proceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023. IEEE Computer Society, 2023. pp. 2358-2371 (Proceedings - International Conference on Data Engineering; Vol. 2023-April).

TY - GEN

T1 - Community Search: A Meta-Learning Approach

T2 - 39th IEEE International Conference on Data Engineering, ICDE 2023

AU - Fang, Shuheng

AU - Zhao, Kangfei

AU - Li, Guanghua

AU - Xu Yu, Jeffrey

PY - 2023

Y1 - 2023

AB - Community Search (CS) is one of the fundamental graph analysis tasks, which is a building block of various real applications. Given any query nodes, CS aims to find cohesive subgraphs that query nodes belong to. Recently, a large number of CS algorithms are designed. These algorithms adopt predefined subgraph patterns to model the communities, which cannot find ground-truth communities that do not have such pre-defined patterns in real-world graphs. Thereby, machine learning (ML) and deep learning (DL) based approaches are proposed to capture flexible community structures by learning from ground-truth communities in a data-driven fashion. These approaches rely on sufficient training data to provide enough generalization for ML models, however, the ground-truth cannot be comprehensively collected beforehand. In this paper, we study ML/DL-based approaches for CS, under the circumstance of small training data. Instead of directly fitting the small data, we extract prior knowledge which is shared across multiple CS tasks via learning a meta model. Each CS task is a graph with several queries that possess corresponding partial ground-truth. The meta model can be swiftly adapted to a task to be predicted by feeding a few task-specific training data. We find that trivially applying multiple classical meta-learning algorithms to CS suffers from problems regarding prediction effectiveness, generalization capability and efficiency. To address such problems, we propose a novel meta-learning based framework, Conditional Graph Neural Process (CGNP), to fulfill the prior extraction and adaptation procedure. A meta CGNP model is a task-common node embedding function for clustering, learned by metric-based graph learning, which fully exploits the characteristics of CS. We compare CGNP with CS algorithms and ML baselines on real graphs with ground-truth communities. Our experiments verify that CGNP outperforms the other native graph algorithms and ML/DL baselines 0.33 and 0.26 on F1 score by average.

KW - Community search

KW - Meta-learning

KW - Neural process

UR - http://www.scopus.com/inward/record.url?scp=85167660297&partnerID=8YFLogxK

U2 - 10.1109/ICDE55515.2023.00182

DO - 10.1109/ICDE55515.2023.00182

M3 - Conference contribution

AN - SCOPUS:85167660297

T3 - Proceedings - International Conference on Data Engineering

SP - 2358

EP - 2371

BT - Proceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023

PB - IEEE Computer Society

Y2 - 3 April 2023 through 7 April 2023

ER -
