Abstract
Representing words as embeddings has proven successful in improving performance on many natural language processing tasks. Unlike traditional methods that learn embeddings from large text corpora, ensemble methods have been proposed to leverage the merits of pre-trained word embeddings as well as external semantic sources. In this paper, we propose a knowledge-enriched ensemble method that combines information from both knowledge graphs and pre-trained word embeddings. Specifically, we propose an attention network that retrofits the semantic information in a lexical knowledge graph into the pre-trained word embeddings. We further extend our method to contextual word embeddings and multi-sense embeddings. Extensive experiments demonstrate that the proposed word embeddings outperform state-of-the-art models on word analogy, word similarity, and several downstream tasks, and that the proposed word sense embeddings outperform state-of-the-art models on word similarity and word sense induction tasks.
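To make the idea of attention-based retrofitting concrete, below is a minimal, illustrative sketch (not the paper's actual architecture): a pre-trained word vector is blended with an attention-weighted average of its knowledge-graph neighbours. The function name `retrofit_with_attention`, the mixing weight `alpha`, and the toy data are all hypothetical.

```python
# Illustrative sketch of attention-weighted retrofitting; hypothetical names and data.
import numpy as np

def retrofit_with_attention(word_vec, neighbor_vecs, alpha=0.5):
    """Blend a pre-trained vector with an attention-weighted sum of KG neighbours.

    word_vec:      (d,) pre-trained embedding of the target word
    neighbor_vecs: (n, d) embeddings of words linked to it in the lexical KG
    alpha:         assumed mixing weight between the original and KG-derived vectors
    """
    # Attention scores: similarity between the word and each KG neighbour.
    scores = neighbor_vecs @ word_vec                 # (n,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # softmax over neighbours
    kg_vec = weights @ neighbor_vecs                  # attention-weighted neighbour vector
    return alpha * word_vec + (1.0 - alpha) * kg_vec  # knowledge-enriched embedding

# Toy usage with random vectors standing in for pre-trained embeddings.
rng = np.random.default_rng(0)
w = rng.normal(size=300)
neighbours = rng.normal(size=(4, 300))                # e.g. WordNet synonyms of w
enriched = retrofit_with_attention(w, neighbours)
print(enriched.shape)                                 # (300,)
```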
| Original language | English |
|---|---|
| Pages (from-to) | 5534-5549 |
| Number of pages | 16 |
| Journal | IEEE Transactions on Knowledge and Data Engineering |
| Volume | 35 |
| Issue | 6 |
| DOI | |
| Publication status | Published - 1 Jun 2023 |