Extracting fine-grained entities based on coordinate graph

Qing Yang, Peng Jiang, Chunxia Zhang*, Zhendong Niu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Most previous entity extraction studies focus on a small set of coarse-grained classes, such as person. However, the distribution of entities in search engine query logs indicates that users are more interested in a wider range of fine-grained entities, such as GRAMMY winners and Ivy League members. In this paper, we present a semi-supervised method to extract fine-grained entities from an open-domain corpus. We build a graph over the entities that appear in coordinate lists, which are HTML nodes that share the same tag path in the DOM tree. Class labels are then propagated over the graph from known entities to unknown ones. Experiments on a large corpus from the ClueWeb09a dataset show that the proposed approach achieves promising results.
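The two steps named in the abstract, building a graph from coordinate lists and propagating class labels over it, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify edge weights, the propagation rule, or any function names, so `TagPathCollector`, `build_graph`, `propagate`, the co-occurrence edge weights, and the clamped weighted-average update are all assumptions made for illustration. As a further simplification, text nodes are grouped per page by tag path only, so two separate lists with the same tag path on one page would be merged.

```python
from collections import defaultdict
from html.parser import HTMLParser
from itertools import combinations

VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}  # tags with no end tag


class TagPathCollector(HTMLParser):
    """Group the text nodes of one page by the tag path that leads to them."""

    def __init__(self):
        super().__init__()
        self.stack = []                  # currently open tags
        self.groups = defaultdict(list)  # tag path -> texts found under it

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.groups["/".join(self.stack)].append(text)


def build_graph(pages):
    """Link entities that share a tag path on a page (assumed weight: co-occurrence count)."""
    graph = defaultdict(lambda: defaultdict(float))
    for html in pages:
        collector = TagPathCollector()
        collector.feed(html)
        collector.close()
        for items in collector.groups.values():
            for a, b in combinations(sorted(set(items)), 2):
                graph[a][b] += 1.0
                graph[b][a] += 1.0
    return graph


def propagate(graph, seeds, iterations=10):
    """Spread seed labels to neighbours by repeated weighted averaging (assumed rule)."""
    labels = sorted(set(seeds.values()))
    scores = {node: {label: 0.0 for label in labels} for node in graph}
    for node, label in seeds.items():
        if node in scores:
            scores[node][label] = 1.0
    for _ in range(iterations):
        updated = {}
        for node, neighbours in graph.items():
            if node in seeds:            # known entities keep their label
                updated[node] = scores[node]
                continue
            total = sum(neighbours.values())
            updated[node] = {
                label: sum(w * scores[m][label] for m, w in neighbours.items()) / total
                for label in labels
            }
        scores = updated
    return scores


# Hypothetical toy pages; the class names are taken from the abstract's examples.
pages = [
    "<html><body><ul><li>Harvard</li><li>Yale</li><li>Princeton</li></ul></body></html>",
    "<html><body><ul><li>Yale</li><li>Brown</li></ul></body></html>",
    "<html><body><ol><li>Adele</li><li>Beyonce</li></ol></body></html>",
]
seeds = {"Harvard": "Ivy League member", "Adele": "GRAMMY winner"}
scores = propagate(build_graph(pages), seeds)
predicted = {node: max(s, key=s.get) for node, s in scores.items()}
```

In this toy run, "Brown" never shares a list with the seed "Harvard", but it is reachable through "Yale", so the Ivy League label reaches it over two hops while the GRAMMY label stays confined to the other connected component.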

Original language: English
Title of host publication: Natural Language Processing and Information Systems - 18th International Conference on Applications of Natural Language to Information Systems, NLDB 2013, Proceedings
Pages: 367-371
Number of pages: 5
Publication status: Published - 2013
Event: 18th International Conference on Application of Natural Language to Information Systems, NLDB 2013 - Salford, United Kingdom
Duration: 19 Jun 2013 to 21 Jun 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7934 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Conference on Application of Natural Language to Information Systems, NLDB 2013
Country/Territory: United Kingdom
City: Salford
Period: 19/06/13 to 21/06/13
