TY - GEN
T1 - Knowledge base error detection with relation sensitive embedding
AU - Kim, San
AU - Li, Xiuxing
AU - Li, Kaiyu
AU - Feng, Jianhua
AU - Huang, Yan
AU - Yang, Songfan
N1 - Publisher Copyright:
© Springer Nature Switzerland AG 2019.
PY - 2019
Y1 - 2019
N2 - Recently, knowledge bases (KBs) have become more and more essential and helpful data source for various applications and researches. Although modern KBs have included thousands of millions of facts, they still suffer from incompleteness compared with the total amount of facts in real world. Furthermore, a lot of inaccurate and outdated facts may be contained in the KBs. Although there have been many studies dealing with incompleteness of the KBs, very few of works have taken into account detecting the errors in the KBs. Broadly speaking, there are three main challenges in detecting errors in the KBs. (1) Symbolic and logical form of the knowledge representations cannot detect the inconsistencies very well on large scale KBs. (2) It is hard to capture the correlations between relations. (3) There is no golden standard to learn or observe the patterns of inaccurate facts. In this work, we propose a Relation Sensitive Embedding Approach (RSEA) to detect the inconsistencies from KBs. We first design two correlation functions to measure the relatedness between two relations. Then, a dynamic cluster algorithm is presented to aggregate highly correlated relations into the same clusters. Finally, we encode discrete knowledge facts with effects of correlated relations into continuous vector space, which can effectively detect errors in KBs. We perform extensive experiments on two benchmark datasets, and the results show that our approach achieves high performance in detecting incorrect knowledge facts in these KBs.
AB - Recently, knowledge bases (KBs) have become more and more essential and helpful data source for various applications and researches. Although modern KBs have included thousands of millions of facts, they still suffer from incompleteness compared with the total amount of facts in real world. Furthermore, a lot of inaccurate and outdated facts may be contained in the KBs. Although there have been many studies dealing with incompleteness of the KBs, very few of works have taken into account detecting the errors in the KBs. Broadly speaking, there are three main challenges in detecting errors in the KBs. (1) Symbolic and logical form of the knowledge representations cannot detect the inconsistencies very well on large scale KBs. (2) It is hard to capture the correlations between relations. (3) There is no golden standard to learn or observe the patterns of inaccurate facts. In this work, we propose a Relation Sensitive Embedding Approach (RSEA) to detect the inconsistencies from KBs. We first design two correlation functions to measure the relatedness between two relations. Then, a dynamic cluster algorithm is presented to aggregate highly correlated relations into the same clusters. Finally, we encode discrete knowledge facts with effects of correlated relations into continuous vector space, which can effectively detect errors in KBs. We perform extensive experiments on two benchmark datasets, and the results show that our approach achieves high performance in detecting incorrect knowledge facts in these KBs.
KW - Embedding model
KW - Error detection
KW - Knowledge base
UR - https://www.scopus.com/pages/publications/85065529883
U2 - 10.1007/978-3-030-18576-3_43
DO - 10.1007/978-3-030-18576-3_43
M3 - Conference contribution
AN - SCOPUS:85065529883
SN - 9783030185756
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 725
EP - 741
BT - Database Systems for Advanced Applications - 24th International Conference, DASFAA 2019, Proceedings
A2 - Gama, Joao
A2 - Li, Guoliang
A2 - Yang, Jun
A2 - Tong, Yongxin
A2 - Natwichai, Juggapong
PB - Springer Verlag
T2 - 24th International Conference on Database Systems for Advanced Applications, DASFAA 2019
Y2 - 22 April 2019 through 25 April 2019
ER -