Context-Aware Semantic Type Identification for Relational Attributes

Yue Ding, Yu He Guo, Wei Lu*, Hai Xiang Li, Mei Hui Zhang, Hui Li, An Qun Pan, Xiao Yong Du

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Identifying the semantic types of attributes in relations, known as attribute semantic type (AST) identification, plays an important role in many data analysis tasks, such as data cleaning, schema matching, and keyword search in databases. However, because prevalent information systems (a.k.a. information islands) lack unified naming standards, AST identification remains an open problem. To tackle this problem, we propose a context-aware method for identifying the ASTs of relations. We cast AST identification as a multi-class classification problem and propose a schema context aware (SCA) model that learns a representation from a collection of relations together with their attribute values and schema context. Based on the learned representation, we predict the AST of a given attribute in an underlying relation, where the predicted AST is mapped to one of the labeled ASTs. To improve AST identification performance, especially when the predicted semantic types of attributes are not included in the labeled ASTs, we then introduce knowledge base embeddings (a.k.a. KBVec) to enhance this representation and construct a knowledge-base-enhanced schema context aware model (SCA-KB) that is stable and robust. Extensive experiments on real datasets demonstrate that our context-aware method outperforms state-of-the-art approaches by a large margin: up to 6.14% and 25.17% in macro average F1 score, and up to 0.28% and 9.56% in weighted F1 score, on high-quality and low-quality datasets respectively.
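
The abstract reports results in both macro average F1 and weighted F1. As a minimal sketch of how the two averages differ, the Python below computes both for a toy multi-class labeling. The labels ("city", "isbn"), the predictions, and the function names are illustrative assumptions, not the paper's data or code. Macro averaging gives every semantic type equal weight, while weighted averaging scales each type's F1 by its number of true instances, so errors on rare types pull the macro score down more.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's true-instance count."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


# Hypothetical attribute labels: a frequent type and a rare one.
y_true = ["city"] * 8 + ["isbn"] * 2
# One of the two rare "isbn" attributes is misclassified as "city".
y_pred = ["city"] * 8 + ["isbn", "city"]

print(f"macro F1:    {macro_f1(y_true, y_pred):.4f}")
print(f"weighted F1: {weighted_f1(y_true, y_pred):.4f}")
```

In this toy case a single mistake on the rare type lowers the macro score more than the weighted score, which is one way a method can show a much larger gain in macro F1 than in weighted F1.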

Original language: English
Pages (from-to): 927-946
Number of pages: 20
Journal: Journal of Computer Science and Technology
Volume: 38
Issue number: 4
DOI: https://doi.org/10.1007/s11390-021-1048-y
Publication status: Published - Jul 2023

Keywords

  • attribute semantic type (AST) identification
  • context-aware
  • knowledge base embedding
  • semantic embedding

Cite this

Ding, Y., Guo, Y. H., Lu, W., Li, H. X., Zhang, M. H., Li, H., Pan, A. Q., & Du, X. Y. (2023). Context-Aware Semantic Type Identification for Relational Attributes. Journal of Computer Science and Technology, 38(4), 927-946. https://doi.org/10.1007/s11390-021-1048-y