TY - GEN
T1 - Rule-based graph repairing
T2 - 34th IEEE International Conference on Data Engineering, ICDE 2018
AU - Cheng, Yurong
AU - Chen, Lei
AU - Yuan, Ye
AU - Wang, Guoren
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/10/24
Y1 - 2018/10/24
N2 - Real-life graph datasets extracted from Web are inevitably full of incompleteness, conflicts, and redundancies, so graph data cleaning shows its necessity. One of the main issues is to automatically repair the graph with some repairing rules. Although rules like data dependencies have been widely studied in relational data repairing, very few works exist to repair the graph data. In this paper, we introduce an automatic repairing semantic for graphs, called Graph-Repairing Rules (GRRs). This semantic can capture the incompleteness, conflicts, and redundancies in the graphs and indicate how to correct these errors. We study three fundamental problems associated with GRRs, implication, consistency and termination, which show whether a given set of GRRs make sense. Repairing the graph data using GRRs involves a problem of finding isomorphic subgraphs of the graph data for each GRR, which is NP-complete. To efficiently circumvent the complex calculation of subgraph isomorphism, we design a decomposition-And-join strategy to solve this problem. Extensive experiments on real datasets show that our GRR semantic and corresponding repairing algorithms can effectively and efficiently repair real-life graph data.
AB - Real-life graph datasets extracted from Web are inevitably full of incompleteness, conflicts, and redundancies, so graph data cleaning shows its necessity. One of the main issues is to automatically repair the graph with some repairing rules. Although rules like data dependencies have been widely studied in relational data repairing, very few works exist to repair the graph data. In this paper, we introduce an automatic repairing semantic for graphs, called Graph-Repairing Rules (GRRs). This semantic can capture the incompleteness, conflicts, and redundancies in the graphs and indicate how to correct these errors. We study three fundamental problems associated with GRRs, implication, consistency and termination, which show whether a given set of GRRs make sense. Repairing the graph data using GRRs involves a problem of finding isomorphic subgraphs of the graph data for each GRR, which is NP-complete. To efficiently circumvent the complex calculation of subgraph isomorphism, we design a decomposition-And-join strategy to solve this problem. Extensive experiments on real datasets show that our GRR semantic and corresponding repairing algorithms can effectively and efficiently repair real-life graph data.
KW - Graph Repair
KW - Rule
UR - http://www.scopus.com/inward/record.url?scp=85057094152&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2018.00075
DO - 10.1109/ICDE.2018.00075
M3 - Conference contribution
AN - SCOPUS:85057094152
T3 - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
SP - 773
EP - 784
BT - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 16 April 2018 through 19 April 2018
ER -