Improving candidate generation for entity linking

  • Yuhang Guo
  • Bing Qin*
  • Yuqin Li
  • Ting Liu
  • Sheng Li

  *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

13 Citations (Scopus)

Abstract

Entity linking is the task of linking names in free text to their referent entities in a knowledge base. Most recently proposed linking systems can be broken down into two steps: candidate generation and candidate ranking. The first step retrieves candidates from the knowledge base and the second step disambiguates among them. Previous work has focused on the recall of the generation step, because if the target entity is absent from the candidate set, no ranking method can return the correct result. Most recall-driven generation strategies increase the number of candidates. With large candidate sets, however, memory- and time-consuming systems are impractical for online applications. In this paper, we propose a novel candidate generation approach that produces a small candidate set with high recall. Experimental results on two KBP data sets show that candidate generation recall exceeds 93%. With our approach, the number of candidates is reduced from hundreds to dozens, system runtime is reduced by 70.3% and 76.6% relative to the baseline, and the highest micro-averaged accuracy in the evaluation is improved by 2.2% and 3.4%.
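
The trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the function names, the "pick the first candidate" ranker, and the toy data are all hypothetical, and it assumes every mention has a gold entity in the knowledge base. It shows only how candidate recall and average candidate set size might be measured, and why recall is an upper bound on linking accuracy.

```python
from typing import Dict, List, Optional


def candidate_recall(candidates: Dict[str, List[str]], gold: Dict[str, str]) -> float:
    """Fraction of mentions whose gold entity appears in its candidate set."""
    if not gold:
        return 0.0
    hits = sum(1 for mention, entity in gold.items()
               if entity in candidates.get(mention, []))
    return hits / len(gold)


def mean_candidate_count(candidates: Dict[str, List[str]], gold: Dict[str, str]) -> float:
    """Average number of candidates generated per mention."""
    if not gold:
        return 0.0
    return sum(len(candidates.get(mention, [])) for mention in gold) / len(gold)


def first_candidate_ranker(cands: List[str]) -> Optional[str]:
    """Hypothetical stand-in for a ranker: returns the first candidate, if any."""
    return cands[0] if cands else None


def linking_accuracy(candidates: Dict[str, List[str]], gold: Dict[str, str]) -> float:
    """Fraction of mentions for which the ranker returns the gold entity."""
    if not gold:
        return 0.0
    correct = sum(1 for mention, entity in gold.items()
                  if first_candidate_ranker(candidates.get(mention, [])) == entity)
    return correct / len(gold)


# Toy data, invented for illustration only (not taken from the paper).
gold = {"m1": "E1", "m2": "E2", "m3": "E3", "m4": "E4"}
candidates = {
    "m1": ["E1", "E9"],        # gold entity present and ranked first
    "m2": ["E7", "E2", "E8"],  # gold entity present but not ranked first
    "m3": ["E5"],              # gold entity missing: no ranker can recover it
    "m4": [],                  # no candidates generated
}

recall = candidate_recall(candidates, gold)
avg_size = mean_candidate_count(candidates, gold)
accuracy = linking_accuracy(candidates, gold)
```

In this toy setup the ranker can only be correct on mentions whose gold entity survived candidate generation, so accuracy never exceeds recall, while the average candidate set size is a rough proxy for the work the ranking step has to do.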

Original language: English
Title of host publication: Natural Language Processing and Information Systems - 18th International Conference on Applications of Natural Language to Information Systems, NLDB 2013, Proceedings
Pages: 225-236
Number of pages: 12
Publication status: Published - 2013
Externally published: Yes
Event: 18th International Conference on Applications of Natural Language to Information Systems, NLDB 2013 - Salford, United Kingdom
Duration: 19 Jun 2013 to 21 Jun 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7934 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Conference on Applications of Natural Language to Information Systems, NLDB 2013
Country/Territory: United Kingdom
City: Salford
Period: 19/06/13 to 21/06/13

Keywords

  • Candidate Generation
  • Candidate Pruning
  • Entity Linking
  • Information Extraction
  • Natural Language Processing
