TY - JOUR
T1 - An entity linking method for microblog based on semantic categorization by word embeddings
AU - Feng, Chong
AU - Shi, Ge
AU - Guo, Yu Hang
AU - Gong, Jing
AU - Huang, He Yan
N1 - Publisher Copyright: Copyright © 2016 Acta Automatica Sinica. All rights reserved.
PY - 2016/6/1
Y1 - 2016/6/1
N2 - As a widely applied task in natural language processing (NLP), named entity linking (NEL) links a given mention to an unambiguous entity in a knowledge base. NEL plays an important role in information extraction and question answering. Because microblog posts are short, traditional algorithms designed for linking in long texts do not fit the microblog linking task well. Previous studies mostly built models based on mentions and their context to disambiguate entities, and such models have difficulty distinguishing candidates with similar lexical and syntactic features. In this paper, we propose a novel NEL method based on semantic categorization through abstraction in terms of word embeddings, which makes full use of the semantics involved in mentions and candidates. First, we obtain word embeddings with a neural network and cluster the entities to serve as features. Then, the candidates are disambiguated by predicting the categories of entities with multiple classifiers. Finally, we test the method on the NLPCC2014 dataset and conclude that the proposed method achieves better results than the best known work, especially in accuracy.
AB - As a widely applied task in natural language processing (NLP), named entity linking (NEL) links a given mention to an unambiguous entity in a knowledge base. NEL plays an important role in information extraction and question answering. Because microblog posts are short, traditional algorithms designed for linking in long texts do not fit the microblog linking task well. Previous studies mostly built models based on mentions and their context to disambiguate entities, and such models have difficulty distinguishing candidates with similar lexical and syntactic features. In this paper, we propose a novel NEL method based on semantic categorization through abstraction in terms of word embeddings, which makes full use of the semantics involved in mentions and candidates. First, we obtain word embeddings with a neural network and cluster the entities to serve as features. Then, the candidates are disambiguated by predicting the categories of entities with multiple classifiers. Finally, we test the method on the NLPCC2014 dataset and conclude that the proposed method achieves better results than the best known work, especially in accuracy.
KW - Entity linking
KW - Multiple classifiers
KW - Neural network
KW - Social media processing
KW - Word embedding
UR - http://www.scopus.com/inward/record.url?scp=84977266604&partnerID=8YFLogxK
U2 - 10.16383/j.aas.2016.c150715
DO - 10.16383/j.aas.2016.c150715
M3 - Article
AN - SCOPUS:84977266604
SN - 0254-4156
VL - 42
SP - 915
EP - 922
JO - Zidonghua Xuebao/Acta Automatica Sinica
JF - Zidonghua Xuebao/Acta Automatica Sinica
IS - 6
ER -