Domain-specific meta-embedding with latent semantic structures

Qian Liu, Jie Lu*, Guangquan Zhang, Tao Shen, Zhihan Zhang, Heyan Huang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

19 Citations (Scopus)

Abstract

Meta-embedding aims to assemble pre-trained embeddings from various sources and produce more expressive word representations. Many natural language processing (NLP) tasks in a specific domain benefit from meta-embedding, especially when the task is low-resource. This paper proposes an unsupervised meta-embedding method that jointly models background knowledge from the source embeddings and domain-specific knowledge from the task domain. Specifically, the embeddings of a word from multiple sources are dynamically aggregated into a single meta-embedding by a differentiable attention module. Because the source embeddings are pre-trained on large-scale corpora, they provide broad background knowledge of word usage. The meta-embedding is then further enriched with domain-specific knowledge from each task domain in two ways. First, contextual information in the raw corpus is exploited to capture word semantics. Second, a graph representing domain-specific semantic structures is extracted from the raw corpus to highlight relationships between salient words; this graph is modeled by a graph convolutional network to capture the rich semantic structures among words in the task domain. Experiments on two tasks, text classification and relation extraction, show that our model produces more accurate word meta-embeddings for the task domain than other state-of-the-art competitors.
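To make the two core components of the abstract concrete, below is a minimal PyTorch sketch of (a) attention-based aggregation of source embeddings into a meta-embedding and (b) one graph-convolution layer over a domain word graph. All class and variable names (`MetaEmbeddingAttention`, `GraphConvLayer`, `adj_norm`, etc.) are hypothetical illustrations of the techniques the abstract names, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MetaEmbeddingAttention(nn.Module):
    """Hypothetical sketch: fuse K source embeddings of a word into one
    meta-embedding via a differentiable attention module."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # scores each source embedding

    def forward(self, source_embs: torch.Tensor) -> torch.Tensor:
        # source_embs: (batch, K, dim); sources already projected to a shared dim
        scores = self.score(torch.tanh(source_embs)).squeeze(-1)   # (batch, K)
        weights = F.softmax(scores, dim=-1)                        # attention over sources
        return (weights.unsqueeze(-1) * source_embs).sum(dim=1)    # (batch, dim)


class GraphConvLayer(nn.Module):
    """Hypothetical sketch of one GCN layer over the domain word graph:
    H' = ReLU(A_hat @ H @ W), with A_hat a normalized adjacency matrix."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, h: torch.Tensor, adj_norm: torch.Tensor) -> torch.Tensor:
        # h: (V, in_dim) node features, e.g. the meta-embeddings of V words
        # adj_norm: (V, V) symmetrically normalized adjacency with self-loops
        return F.relu(adj_norm @ self.proj(h))


# Toy usage: 5 words, 3 embedding sources, dimension 8.
embs = torch.randn(5, 3, 8)
meta = MetaEmbeddingAttention(dim=8)(embs)   # (5, 8) meta-embeddings
adj = torch.eye(5)                           # trivial self-loop graph for illustration
enriched = GraphConvLayer(8, 8)(meta, adj)   # (5, 8) domain-enriched embeddings
```

In practice the adjacency matrix would come from the domain-specific semantic graph extracted from the raw corpus, rather than the identity matrix used here for illustration.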

Original language: English
Pages (from-to): 410-423
Number of pages: 14
Journal: Information Sciences
Volume: 555
DOI
Publication status: Published - May 2021
