A Span-Based Distantly Supervised NER with Self-learning

Hongli Mao, Hanlin Tang, Wen Zhang, Heyan Huang, Xian Ling Mao*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

The lack of labeled data is one of the major obstacles for named entity recognition (NER). Distant supervision, which automatically generates annotated training data from dictionaries, is often used to alleviate this problem. However, as far as we know, existing distant-supervision-based methods do not consider latent entities that are missing from the dictionaries. Intuitively, entities of the same type share similar contextualized features, so these features can be used to extract latent entities from the corpus and add them to the corresponding dictionaries, improving the performance of distant-supervision-based methods. Thus, in this paper, we propose a novel span-based self-learning method that employs span-level features to update the corresponding dictionaries. Specifically, the proposed method directly takes all possible spans into account and scores them for each label, then adds latent entities chosen from the candidate spans to the corresponding dictionaries based on both local and global features. Extensive experiments on two public datasets show that our proposed method performs better than the state-of-the-art baselines.
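The loop the abstract describes (enumerate candidate spans, score each span for each label, add confident unseen spans to the dictionaries) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the names `enumerate_spans`, `distant_labels`, `update_dictionaries`, and `toy_score`, the `max_len` and `threshold` parameters, and the hand-written scoring rule are all hypothetical. In the paper the score comes from a learned model using local and global features, which this sketch collapses into a single scoring function.

```python
from typing import Callable, Dict, List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def enumerate_spans(tokens: List[str], max_len: int = 3) -> List[Span]:
    """Return every contiguous span of at most max_len tokens."""
    return [
        (i, j)
        for i in range(len(tokens))
        for j in range(i + 1, min(i + max_len, len(tokens)) + 1)
    ]


def distant_labels(
    tokens: List[str], dictionaries: Dict[str, Set[str]], max_len: int = 3
) -> Dict[Span, str]:
    """Distant supervision: label spans whose text appears in a type dictionary."""
    labels: Dict[Span, str] = {}
    for start, end in enumerate_spans(tokens, max_len):
        text = " ".join(tokens[start:end])
        for label, entries in dictionaries.items():
            if text in entries:
                labels[(start, end)] = label
    return labels


def update_dictionaries(
    corpus: List[List[str]],
    dictionaries: Dict[str, Set[str]],
    score_span: Callable[[List[str], Span, str], float],
    threshold: float = 0.9,
    max_len: int = 3,
) -> int:
    """One self-learning round: add confidently scored, unseen spans in place."""
    added = 0
    for tokens in corpus:
        for start, end in enumerate_spans(tokens, max_len):
            text = " ".join(tokens[start:end])
            for label, entries in dictionaries.items():
                if text in entries:
                    continue
                if score_span(tokens, (start, end), label) >= threshold:
                    entries.add(text)
                    added += 1
    return added


def toy_score(tokens: List[str], span: Span, label: str) -> float:
    """Hypothetical stand-in for a learned span scorer (assumption, not the paper's)."""
    start, end = span
    capitalized = all(t[0].isupper() for t in tokens[start:end])
    follows_in = start > 0 and tokens[start - 1] == "in"
    return 1.0 if label == "LOC" and capitalized and follows_in else 0.0


# Toy data: "Zhengzhou" is a latent entity that the seed dictionary misses.
corpus = [["She", "lives", "in", "Paris"], ["He", "works", "in", "Zhengzhou"]]
dictionaries = {"LOC": {"Paris"}}
num_added = update_dictionaries(corpus, dictionaries, toy_score)
```

After one round on this toy corpus, "Zhengzhou" joins the `LOC` dictionary because it shares the context of the known entry "Paris", so a later call to `distant_labels` would annotate it as well.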

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 9th CCF International Conference, NLPCC 2020, Proceedings
Editors: Xiaodan Zhu, Min Zhang, Yu Hong, Ruifang He
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 192-203
Number of pages: 12
ISBN (Print): 9783030604493
Publication status: Published - 2020
Event: 9th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2020 - Zhengzhou, China
Duration: 14 Oct 2020 to 18 Oct 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12430 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2020
Country/Territory: China
City: Zhengzhou
Period: 14/10/20 to 18/10/20

Keywords

  • Distant supervision
  • Named entity recognition
  • Self-learning
  • Span-level
