Computing Graph Edit Distance via Neural Graph Matching

Chengzhi Piao; Tingyang Xu; Xiangguo Sun; Yu Rong; Kangfei Zhao; Hong Cheng

doi:10.14778/3594512.3594514

Research output: Contribution to journal › Conference article › peer-review

9 Citations (Scopus)

Abstract

Graph edit distance (GED) computation is a fundamental NP-hard problem in graph theory. Given a graph pair (G1, G2), GED is defined as the minimum number of primitive operations converting G1 to G2. Due to its NP-hardness, early studies focus on search-based inexact algorithms such as A*-beam search, and on greedy algorithms using bipartite matching. They can obtain a sub-optimal solution by constructing an edit path (the sequence of operations that converts G1 to G2). Recent studies convert the GED between a given graph pair (G1, G2) into a similarity score in the range (0, 1) by a well-designed function. Then machine learning models (mostly based on graph neural networks) are applied to predict the similarity score. They achieve a much higher numerical precision than the sub-optimal solutions found by classical algorithms. However, a major limitation is that these machine learning models cannot generate an edit path. They treat the GED computation as a pure regression task to bypass its intrinsic complexity, but ignore the essential task of converting G1 to G2. This severely limits the interpretability and usability of the solution. In this paper, we propose a novel deep learning framework that solves the GED problem in a two-step manner: 1) the proposed graph neural network GEDGNN predicts the GED value and a matching matrix; and 2) a post-processing algorithm based on k-best matching derives k possible node matchings from the matching matrix generated by GEDGNN. The best matching finally leads to a high-quality edit path. Extensive experiments are conducted on three real graph data sets and synthetic power-law graphs to demonstrate the effectiveness of our framework. Compared to the best result of existing GNN-based models, the mean absolute error (MAE) on GED value prediction decreases by 4.9% to 74.3%. Compared to the state-of-the-art search algorithm Noah, the MAE on the GED value based on the edit path is reduced by 53.6% to 88.1%.
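The second step described in the abstract (turning a matching matrix into candidate node matchings and keeping the one with the cheapest edit path) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy graphs, and the score matrix are hypothetical, the graphs are assumed to be unlabeled and undirected with unit edit costs and |V1| <= |V2|, and the k-best matching step is replaced by brute-force enumeration, which is only feasible for very small graphs.

```python
import heapq
from itertools import permutations


def edit_cost(n1, edges1, n2, edges2, mapping):
    """Edit operations induced by an injective mapping of G1's nodes into G2.

    Assumes unlabeled, undirected graphs with n1 <= n2 and unit costs, so the
    operations are node insertion, edge insertion, and edge deletion.
    The result is an upper bound on the GED.
    """
    mapped = {frozenset((mapping[u], mapping[v])) for u, v in edges1}
    target = {frozenset(e) for e in edges2}
    node_insertions = n2 - n1               # unmatched nodes of G2
    edge_deletions = len(mapped - target)   # edges of G1 with no counterpart
    edge_insertions = len(target - mapped)  # edges of G2 not yet present
    return node_insertions + edge_deletions + edge_insertions


def k_best_matchings(scores, k):
    """Return the k injective matchings with the highest total score.

    Brute force over all injective mappings; a stand-in for a real
    k-best matching algorithm.
    """
    n1, n2 = len(scores), len(scores[0])
    candidates = permutations(range(n2), n1)
    return heapq.nlargest(
        k, candidates, key=lambda m: sum(scores[i][m[i]] for i in range(n1))
    )


def best_edit_cost(n1, edges1, n2, edges2, scores, k):
    """Evaluate the k candidate matchings and keep the cheapest one."""
    candidates = k_best_matchings(scores, k)
    best = min(candidates, key=lambda m: edit_cost(n1, edges1, n2, edges2, m))
    return edit_cost(n1, edges1, n2, edges2, best), best


# Toy input: G1 is a path 0-1-2, G2 is a star with center 0 and leaves 1, 2, 3.
PATH_EDGES = [(0, 1), (1, 2)]
STAR_EDGES = [(0, 1), (0, 2), (0, 3)]
# Hypothetical matching matrix: row i scores G1 node i against each G2 node.
SCORES = [
    [0.1, 0.6, 0.2, 0.1],
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.2, 0.5, 0.2],
]

cost, matching = best_edit_cost(3, PATH_EDGES, 4, STAR_EDGES, SCORES, k=3)
print(cost, matching)
```

In this toy case the highest-scoring matching sends the middle node of the path to the center of the star, which needs one node insertion and one edge insertion. The identity matching, by contrast, would need four operations, which is why evaluating several candidate matchings rather than a single one can pay off.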

Original language	English
Pages (from-to)	1817-1829
Number of pages	13
Journal	Proceedings of the VLDB Endowment
Volume	16
Issue number	8
DOIs	https://doi.org/10.14778/3594512.3594514
Publication status	Published - 2023
Externally published	Yes
Event	49th International Conference on Very Large Data Bases, VLDB 2023 - Vancouver, Canada; Duration: 28 Aug 2023 → 1 Sept 2023

Access to Document

10.14778/3594512.3594514

Cite this

Piao, C., Xu, T., Sun, X., Rong, Y., Zhao, K., & Cheng, H. (2023). Computing Graph Edit Distance via Neural Graph Matching. Proceedings of the VLDB Endowment, 16(8), 1817-1829. https://doi.org/10.14778/3594512.3594514

@article{eba733ad5266477aa2abf4a89c291abb,

title = "Computing Graph Edit Distance via Neural Graph Matching",

abstract = "Graph edit distance (GED) computation is a fundamental NP-hard problem in graph theory. Given a graph pair (G1, G2), GED is defined as the minimum number of primitive operations converting G1 to G2. Due to its NP-hardness, early studies focus on search-based inexact algorithms such as A*-beam search, and on greedy algorithms using bipartite matching. They can obtain a sub-optimal solution by constructing an edit path (the sequence of operations that converts G1 to G2). Recent studies convert the GED between a given graph pair (G1, G2) into a similarity score in the range (0, 1) by a well-designed function. Then machine learning models (mostly based on graph neural networks) are applied to predict the similarity score. They achieve a much higher numerical precision than the sub-optimal solutions found by classical algorithms. However, a major limitation is that these machine learning models cannot generate an edit path. They treat the GED computation as a pure regression task to bypass its intrinsic complexity, but ignore the essential task of converting G1 to G2. This severely limits the interpretability and usability of the solution. In this paper, we propose a novel deep learning framework that solves the GED problem in a two-step manner: 1) the proposed graph neural network GEDGNN predicts the GED value and a matching matrix; and 2) a post-processing algorithm based on k-best matching derives k possible node matchings from the matching matrix generated by GEDGNN. The best matching finally leads to a high-quality edit path. Extensive experiments are conducted on three real graph data sets and synthetic power-law graphs to demonstrate the effectiveness of our framework. Compared to the best result of existing GNN-based models, the mean absolute error (MAE) on GED value prediction decreases by 4.9% to 74.3%. Compared to the state-of-the-art search algorithm Noah, the MAE on the GED value based on the edit path is reduced by 53.6% to 88.1%.",

author = "Chengzhi Piao and Tingyang Xu and Xiangguo Sun and Yu Rong and Kangfei Zhao and Hong Cheng",

note = "Publisher Copyright: {\textcopyright} 2023, VLDB Endowment. All rights reserved.; 49th International Conference on Very Large Data Bases, VLDB 2023 ; Conference date: 28-08-2023 Through 01-09-2023",

year = "2023",

doi = "10.14778/3594512.3594514",

language = "English",

volume = "16",

pages = "1817--1829",

journal = "Proceedings of the VLDB Endowment",

issn = "2150-8097",

publisher = "Very Large Data Base Endowment Inc.",

number = "8",

}

TY - JOUR

T1 - Computing Graph Edit Distance via Neural Graph Matching

AU - Piao, Chengzhi

AU - Xu, Tingyang

AU - Sun, Xiangguo

AU - Rong, Yu

AU - Zhao, Kangfei

AU - Cheng, Hong

PY - 2023

Y1 - 2023

N2 - Graph edit distance (GED) computation is a fundamental NP-hard problem in graph theory. Given a graph pair (G1, G2), GED is defined as the minimum number of primitive operations converting G1 to G2. Due to its NP-hardness, early studies focus on search-based inexact algorithms such as A*-beam search, and on greedy algorithms using bipartite matching. They can obtain a sub-optimal solution by constructing an edit path (the sequence of operations that converts G1 to G2). Recent studies convert the GED between a given graph pair (G1, G2) into a similarity score in the range (0, 1) by a well-designed function. Then machine learning models (mostly based on graph neural networks) are applied to predict the similarity score. They achieve a much higher numerical precision than the sub-optimal solutions found by classical algorithms. However, a major limitation is that these machine learning models cannot generate an edit path. They treat the GED computation as a pure regression task to bypass its intrinsic complexity, but ignore the essential task of converting G1 to G2. This severely limits the interpretability and usability of the solution. In this paper, we propose a novel deep learning framework that solves the GED problem in a two-step manner: 1) the proposed graph neural network GEDGNN predicts the GED value and a matching matrix; and 2) a post-processing algorithm based on k-best matching derives k possible node matchings from the matching matrix generated by GEDGNN. The best matching finally leads to a high-quality edit path. Extensive experiments are conducted on three real graph data sets and synthetic power-law graphs to demonstrate the effectiveness of our framework. Compared to the best result of existing GNN-based models, the mean absolute error (MAE) on GED value prediction decreases by 4.9% to 74.3%. Compared to the state-of-the-art search algorithm Noah, the MAE on the GED value based on the edit path is reduced by 53.6% to 88.1%.

AB - Graph edit distance (GED) computation is a fundamental NP-hard problem in graph theory. Given a graph pair (G1, G2), GED is defined as the minimum number of primitive operations converting G1 to G2. Due to its NP-hardness, early studies focus on search-based inexact algorithms such as A*-beam search, and on greedy algorithms using bipartite matching. They can obtain a sub-optimal solution by constructing an edit path (the sequence of operations that converts G1 to G2). Recent studies convert the GED between a given graph pair (G1, G2) into a similarity score in the range (0, 1) by a well-designed function. Then machine learning models (mostly based on graph neural networks) are applied to predict the similarity score. They achieve a much higher numerical precision than the sub-optimal solutions found by classical algorithms. However, a major limitation is that these machine learning models cannot generate an edit path. They treat the GED computation as a pure regression task to bypass its intrinsic complexity, but ignore the essential task of converting G1 to G2. This severely limits the interpretability and usability of the solution. In this paper, we propose a novel deep learning framework that solves the GED problem in a two-step manner: 1) the proposed graph neural network GEDGNN predicts the GED value and a matching matrix; and 2) a post-processing algorithm based on k-best matching derives k possible node matchings from the matching matrix generated by GEDGNN. The best matching finally leads to a high-quality edit path. Extensive experiments are conducted on three real graph data sets and synthetic power-law graphs to demonstrate the effectiveness of our framework. Compared to the best result of existing GNN-based models, the mean absolute error (MAE) on GED value prediction decreases by 4.9% to 74.3%. Compared to the state-of-the-art search algorithm Noah, the MAE on the GED value based on the edit path is reduced by 53.6% to 88.1%.

UR - http://www.scopus.com/inward/record.url?scp=85161646304&partnerID=8YFLogxK

U2 - 10.14778/3594512.3594514

DO - 10.14778/3594512.3594514

M3 - Conference article

AN - SCOPUS:85161646304

SN - 2150-8097

VL - 16

SP - 1817

EP - 1829

JO - Proceedings of the VLDB Endowment

JF - Proceedings of the VLDB Endowment

IS - 8

T2 - 49th International Conference on Very Large Data Bases, VLDB 2023

Y2 - 28 August 2023 through 1 September 2023

ER -
