TY - GEN
T1 - Semantics Driven Embedding Learning for Effective Entity Alignment
AU - Zhong, Ziyue
AU - Zhang, Meihui
AU - Fan, Ju
AU - Dou, Chenxiao
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Knowledge-based data service has become an emerging form of service in the world wide web (WWW). To ensure the service quality, a comprehensive knowledge base has to be constructed. Knowledge base integration is often a primary way to improve the completeness. In this paper, we focus on the fundamental problem in knowledge base integration, i.e., entity alignment (EA). EA has been studied for years. Traditional approaches focus on the symbolic features of entities and propose various similarity measures to identify equivalent entities. With recent development in knowledge graph representation learning, embedding-based entity alignment has emerged, which encodes the entities into vectors according to the semantic or structural information and computes the relatedness of entities based on the vector representation. While embedding-based approaches achieve promising results, we identify some important information that are not well exploited in existing works: 1) The neighboring entities contribute differently in the EA process, and should be carefully assigned the importance in learning the relatedness of entities; 2) The attribute values (especially the long texts) contain rich semantics that can build supplementary associations between entities. To this end, we propose SDEA - a Semantics Driven entity embedding method for Entity Alignment. SDEA consists of two modules, namely attribute embedding and relation embedding. The attribute embedding captures the semantic information from attribute values with a pre-trained transformer-based language model. The relation embedding selectively aggregates the semantic information from neighbors using a GRU model equipped with an attention mechanism. Both attribute embedding and relation embedding are driven by semantics, building bridges between entities. Experimental results show that our method significantly outperforms the state-of-the-art approaches on three benchmarks.
AB - Knowledge-based data service has become an emerging form of service in the world wide web (WWW). To ensure the service quality, a comprehensive knowledge base has to be constructed. Knowledge base integration is often a primary way to improve the completeness. In this paper, we focus on the fundamental problem in knowledge base integration, i.e., entity alignment (EA). EA has been studied for years. Traditional approaches focus on the symbolic features of entities and propose various similarity measures to identify equivalent entities. With recent development in knowledge graph representation learning, embedding-based entity alignment has emerged, which encodes the entities into vectors according to the semantic or structural information and computes the relatedness of entities based on the vector representation. While embedding-based approaches achieve promising results, we identify some important information that are not well exploited in existing works: 1) The neighboring entities contribute differently in the EA process, and should be carefully assigned the importance in learning the relatedness of entities; 2) The attribute values (especially the long texts) contain rich semantics that can build supplementary associations between entities. To this end, we propose SDEA - a Semantics Driven entity embedding method for Entity Alignment. SDEA consists of two modules, namely attribute embedding and relation embedding. The attribute embedding captures the semantic information from attribute values with a pre-trained transformer-based language model. The relation embedding selectively aggregates the semantic information from neighbors using a GRU model equipped with an attention mechanism. Both attribute embedding and relation embedding are driven by semantics, building bridges between entities. Experimental results show that our method significantly outperforms the state-of-the-art approaches on three benchmarks.
KW - Entity Alignment
KW - Knowledge Base Integration
KW - Semantics Driven
KW - Transformer
UR - http://www.scopus.com/inward/record.url?scp=85136449092&partnerID=8YFLogxK
U2 - 10.1109/ICDE53745.2022.00205
DO - 10.1109/ICDE53745.2022.00205
M3 - Conference contribution
AN - SCOPUS:85136449092
T3 - Proceedings - International Conference on Data Engineering
SP - 2127
EP - 2140
BT - Proceedings - 2022 IEEE 38th International Conference on Data Engineering, ICDE 2022
PB - IEEE Computer Society
T2 - 38th IEEE International Conference on Data Engineering, ICDE 2022
Y2 - 9 May 2022 through 12 May 2022
ER -