TY - GEN
T1 - What Affects the Performance of Models? Sensitivity Analysis of Knowledge Graph Embedding
AU - Yang, Han
AU - Zhang, Leilei
AU - Su, Fenglong
AU - Pang, Jinhui
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Knowledge graph (KG) embedding aims to embed entities and relations into a low-dimensional vector space, and has been an active research topic for knowledge graph completion (KGC). Recent researchers have improved existing models in terms of knowledge representation space, scoring function, encoding method, etc., achieving progressive improvements. However, the theoretical mechanism behind them has largely been ignored. There are few works on sensitivity analysis of embedding models, which is extremely challenging: the diversity of KGE models makes it difficult to consider them uniformly and compare them fairly. In this paper, we first study the internal connections and mutual transformation methods of different KGE models from the group-theoretic perspective, and further propose a unified KGE learning framework. Then, we conduct an in-depth sensitivity analysis of the factors that affect the objective of embedding learning. Specifically, in addition to the impact of the embedding algorithm itself, this article also considers the structural features of the dataset and the strategies of the training method. After comprehensive experiments and analysis, we conclude that the Head-to-Tail rate of the dataset, the definition of the model metric function, the number of negative samples, and the choice of regularization method have the greatest impact on final performance.
AB - Knowledge graph (KG) embedding aims to embed entities and relations into a low-dimensional vector space, and has been an active research topic for knowledge graph completion (KGC). Recent researchers have improved existing models in terms of knowledge representation space, scoring function, encoding method, etc., achieving progressive improvements. However, the theoretical mechanism behind them has largely been ignored. There are few works on sensitivity analysis of embedding models, which is extremely challenging: the diversity of KGE models makes it difficult to consider them uniformly and compare them fairly. In this paper, we first study the internal connections and mutual transformation methods of different KGE models from the group-theoretic perspective, and further propose a unified KGE learning framework. Then, we conduct an in-depth sensitivity analysis of the factors that affect the objective of embedding learning. Specifically, in addition to the impact of the embedding algorithm itself, this article also considers the structural features of the dataset and the strategies of the training method. After comprehensive experiments and analysis, we conclude that the Head-to-Tail rate of the dataset, the definition of the model metric function, the number of negative samples, and the choice of regularization method have the greatest impact on final performance.
KW - Group theory
KW - Knowledge graph embedding
KW - Sensitivity analysis
UR - http://www.scopus.com/inward/record.url?scp=85129876695&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-00123-9_55
DO - 10.1007/978-3-031-00123-9_55
M3 - Conference contribution
AN - SCOPUS:85129876695
SN - 9783031001222
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 698
EP - 713
BT - Database Systems for Advanced Applications - 27th International Conference, DASFAA 2022, Proceedings
A2 - Bhattacharya, Arnab
A2 - Lee Mong Li, Janice
A2 - Agrawal, Divyakant
A2 - Reddy, P. Krishna
A2 - Mohania, Mukesh
A2 - Mondal, Anirban
A2 - Goyal, Vikram
A2 - Uday Kiran, Rage
PB - Springer Science and Business Media Deutschland GmbH
T2 - 27th International Conference on Database Systems for Advanced Applications, DASFAA 2022
Y2 - 11 April 2022 through 14 April 2022
ER -