A Span-Based Distantly Supervised NER with Self-learning

Hongli Mao, Hanlin Tang, Wen Zhang, Heyan Huang, Xian Ling Mao*

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding, conference contribution, peer-reviewed

2 citations (Scopus)

Abstract

The lack of labeled data is one of the major obstacles for named entity recognition (NER). Distant supervision, which automatically generates annotated training datasets from dictionaries, is often used to alleviate this problem. However, as far as we know, existing distant-supervision-based methods do not consider latent entities that are not in the dictionaries. Intuitively, entities of the same type have similar contextualized features, so these features can be used to extract latent entities from the corpus into the corresponding dictionaries and thereby improve the performance of distant-supervision-based methods. Thus, in this paper, we propose a novel span-based self-learning method, which employs span-level features to update the corresponding dictionaries. Specifically, the proposed method directly takes all possible spans into account and scores them for each label, then adds latent entities from the candidate spans to the corresponding dictionaries based on both local and global features. Extensive experiments on two public datasets show that our proposed method performs better than the state-of-the-art baselines.
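The span enumeration and dictionary update described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `max_len` and `threshold` parameters, and the rule-based `toy_score` are all hypothetical. In particular, a single score threshold stands in for the local and global features that the abstract mentions but does not specify.

```python
from typing import Callable, Dict, List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def enumerate_spans(tokens: List[str], max_len: int) -> List[Span]:
    """List every contiguous span of at most max_len tokens."""
    n = len(tokens)
    return [(i, j) for i in range(n) for j in range(i + 1, min(i + max_len, n) + 1)]


def update_dictionaries(
    tokens: List[str],
    dictionaries: Dict[str, Set[str]],
    score_fn: Callable[[List[str], int, int], Dict[str, float]],
    max_len: int = 3,
    threshold: float = 0.9,  # hypothetical cut-off, not taken from the paper
) -> List[Tuple[str, str]]:
    """Score each candidate span per label and add confident new entities."""
    added = []
    for start, end in enumerate_spans(tokens, max_len):
        text = " ".join(tokens[start:end])
        # Skip spans that some dictionary already contains.
        if any(text in entries for entries in dictionaries.values()):
            continue
        scores = score_fn(tokens, start, end)
        label, score = max(scores.items(), key=lambda kv: kv[1])
        # Labels without a dictionary (here "O") are never added.
        if label in dictionaries and score >= threshold:
            dictionaries[label].add(text)
            added.append((text, label))
    return added


def toy_score(tokens: List[str], start: int, end: int) -> Dict[str, float]:
    """Hypothetical stand-in for a learned span classifier."""
    title_case = all(t[0].isupper() for t in tokens[start:end])
    after_honorific = start > 0 and tokens[start - 1] in {"Dr.", "Mr.", "Ms."}
    per = 0.95 if (title_case and after_honorific and end - start == 2) else 0.1
    return {"PER": per, "O": 1.0 - per}


tokens = ["Dr.", "Ada", "Lovelace", "visited", "London"]
dictionaries = {"PER": {"Alan Turing"}}
added = update_dictionaries(tokens, dictionaries, toy_score)
print(added)
```

In a self-learning setting, the enlarged dictionaries would then be used to re-annotate the corpus for the next training round. That loop is omitted here because the abstract gives no details about it.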

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 9th CCF International Conference, NLPCC 2020, Proceedings
Editors: Xiaodan Zhu, Min Zhang, Yu Hong, Ruifang He
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 192-203
Number of pages: 12
ISBN (print): 9783030604493
DOI
Publication status: Published - 2020
Event: 9th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2020 - Zhengzhou, China
Duration: 14 Oct 2020 to 18 Oct 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12430 LNAI
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 9th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2020
Country/Territory: China
City: Zhengzhou
Period: 14/10/20 to 18/10/20
