A Knowledge-Enriched Ensemble Method for Word Embedding and Multi-Sense Embedding

Lanting Fang; Yong Luo; Kaiyu Feng; Kaiqi Zhao; Aiqun Hu

doi:10.1109/TKDE.2022.3159539

Lanting Fang^*, Yong Luo, Kaiyu Feng^*, Kaiqi Zhao, Aiqun Hu

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

Representing words as embeddings has been proven to be successful in improving the performance in many natural language processing tasks. Different from the traditional methods that learn the embeddings from large text corpora, ensemble methods have been proposed to leverage the merits of pre-trained word embeddings as well as external semantic sources. In this paper, we propose a knowledge-enriched ensemble method to combine information from both knowledge graphs and pre-trained word embeddings. Specifically, we propose an attention network to retrofit the semantic information in the lexical knowledge graph into the pre-trained word embeddings. In addition, we further extend our method to contextual word embeddings and multi-sense embeddings. Extensive experiments demonstrate that the proposed word embeddings outperform the state-of-the-art models in word analogy, word similarity and several downstream tasks. The proposed word sense embeddings outperform the state-of-the-art models in word similarity and word sense induction tasks.

Original language	English
Pages (from-to)	5534-5549
Number of pages	16
Journal	IEEE Transactions on Knowledge and Data Engineering
Volume	35
Issue number	6
DOIs	https://doi.org/10.1109/TKDE.2022.3159539
Publication status	Published - 1 Jun 2023

Keywords

Word embedding
ensemble model
knowledge graph
multi-sense embedding

Access to Document

10.1109/TKDE.2022.3159539

Cite this

@article{e53b5b4e7dd944959ebbc14fa166d227,
title = "A Knowledge-Enriched Ensemble Method for Word Embedding and Multi-Sense Embedding",

abstract = "Representing words as embeddings has been proven to be successful in improving the performance in many natural language processing tasks. Different from the traditional methods that learn the embeddings from large text corpora, ensemble methods have been proposed to leverage the merits of pre-trained word embeddings as well as external semantic sources. In this paper, we propose a knowledge-enriched ensemble method to combine information from both knowledge graphs and pre-trained word embeddings. Specifically, we propose an attention network to retrofit the semantic information in the lexical knowledge graph into the pre-trained word embeddings. In addition, we further extend our method to contextual word embeddings and multi-sense embeddings. Extensive experiments demonstrate that the proposed word embeddings outperform the state-of-the-art models in word analogy, word similarity and several downstream tasks. The proposed word sense embeddings outperform the state-of-the-art models in word similarity and word sense induction tasks.",

keywords = "Word embedding, ensemble model, knowledge graph, multi-sense embedding",
author = "Lanting Fang and Yong Luo and Kaiyu Feng and Kaiqi Zhao and Aiqun Hu",
note = "Publisher Copyright: {\textcopyright} 1989-2012 IEEE.",
year = "2023",
month = jun,
day = "1",
doi = "10.1109/TKDE.2022.3159539",
language = "English",
volume = "35",
pages = "5534--5549",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "6",
}

TY - JOUR
T1 - A Knowledge-Enriched Ensemble Method for Word Embedding and Multi-Sense Embedding
AU - Fang, Lanting
AU - Luo, Yong
AU - Feng, Kaiyu
AU - Zhao, Kaiqi
AU - Hu, Aiqun
PY - 2023/6/1
Y1 - 2023/6/1

N2 - Representing words as embeddings has been proven to be successful in improving the performance in many natural language processing tasks. Different from the traditional methods that learn the embeddings from large text corpora, ensemble methods have been proposed to leverage the merits of pre-trained word embeddings as well as external semantic sources. In this paper, we propose a knowledge-enriched ensemble method to combine information from both knowledge graphs and pre-trained word embeddings. Specifically, we propose an attention network to retrofit the semantic information in the lexical knowledge graph into the pre-trained word embeddings. In addition, we further extend our method to contextual word embeddings and multi-sense embeddings. Extensive experiments demonstrate that the proposed word embeddings outperform the state-of-the-art models in word analogy, word similarity and several downstream tasks. The proposed word sense embeddings outperform the state-of-the-art models in word similarity and word sense induction tasks.

AB - Representing words as embeddings has been proven to be successful in improving the performance in many natural language processing tasks. Different from the traditional methods that learn the embeddings from large text corpora, ensemble methods have been proposed to leverage the merits of pre-trained word embeddings as well as external semantic sources. In this paper, we propose a knowledge-enriched ensemble method to combine information from both knowledge graphs and pre-trained word embeddings. Specifically, we propose an attention network to retrofit the semantic information in the lexical knowledge graph into the pre-trained word embeddings. In addition, we further extend our method to contextual word embeddings and multi-sense embeddings. Extensive experiments demonstrate that the proposed word embeddings outperform the state-of-the-art models in word analogy, word similarity and several downstream tasks. The proposed word sense embeddings outperform the state-of-the-art models in word similarity and word sense induction tasks.

KW - Word embedding
KW - ensemble model
KW - knowledge graph
KW - multi-sense embedding
UR - http://www.scopus.com/inward/record.url?scp=85126546671&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2022.3159539
DO - 10.1109/TKDE.2022.3159539
M3 - Article
AN - SCOPUS:85126546671
SN - 1041-4347
VL - 35
SP - 5534
EP - 5549
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 6
ER -
