TY - GEN
T1 - A Span-Based Distantly Supervised NER with Self-learning
AU - Mao, Hongli
AU - Tang, Hanlin
AU - Zhang, Wen
AU - Huang, Heyan
AU - Mao, Xian Ling
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
N2 - The lack of labeled data is one of the major obstacles for named entity recognition (NER). Distant supervision, which automatically generates annotated training datasets from dictionaries, is often used to alleviate this problem. However, as far as we know, existing distant-supervision-based methods do not consider latent entities that are absent from the dictionaries. Intuitively, entities of the same type have similar contextualized features, so these features can be used to extract latent entities from the corpus into the corresponding dictionaries and thereby improve the performance of distant-supervision-based methods. Thus, in this paper, we propose a novel span-based self-learning method, which employs span-level features to update the corresponding dictionaries. Specifically, the proposed method directly takes all possible spans into account and scores them for each label, then adds latent entities from the candidate spans to the corresponding dictionaries based on both local and global features. Extensive experiments on two public datasets show that the proposed method performs better than the state-of-the-art baselines.
KW - Distant supervision
KW - Named entity recognition
KW - Self-learning
KW - Span-level
UR - http://www.scopus.com/inward/record.url?scp=85093106841&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-60450-9_16
DO - 10.1007/978-3-030-60450-9_16
M3 - Conference contribution
AN - SCOPUS:85093106841
SN - 9783030604493
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 192
EP - 203
BT - Natural Language Processing and Chinese Computing - 9th CCF International Conference, NLPCC 2020, Proceedings
A2 - Zhu, Xiaodan
A2 - Zhang, Min
A2 - Hong, Yu
A2 - He, Ruifang
PB - Springer Science and Business Media Deutschland GmbH
T2 - 9th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2020
Y2 - 14 October 2020 through 18 October 2020
ER -