SNER-CS: Self-training Named Entity Recognition in Computer Science

Jing Jing Zhu, Xian Ling Mao*, Heyan Huang

*Corresponding author for this work

Research output: Contribution to journal, Conference article, peer-reviewed

1 Citation (Scopus)

Abstract

As the number of scientific publications grows, especially in the computer science (CS) domain, it is important to extract scientific entities from the large body of CS publications. Distantly supervised methods, which automatically generate distantly annotated training data by string matching against an external dictionary, have been widely used in named entity recognition (NER). However, applying distantly supervised methods to computer science NER poses two challenges. One is that new tasks, methods, and datasets are proposed rapidly in CS, which makes it difficult to build a computer science entity knowledge base with high coverage. The other is noisy annotation, because there is no uniform standard for representing entities in the computer science domain. To alleviate these two problems, we propose SNER-CS, a novel self-training method based on a pretrained language model, together with a system that automatically constructs distantly supervised labels in CS. Experimental results show that the proposed model SNER-CS outperforms previous state-of-the-art methods on the computer science NER task.
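The distant annotation step described in the abstract (string matching against an external dictionary) can be illustrated with a minimal sketch. This is not the SNER-CS implementation: the dictionary entries, the entity type names, the `distant_annotate` function, and the greedy longest-match strategy are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dictionary: lowercased token sequence -> entity type.
# Entries and type names are illustrative, not taken from the paper.
DICTIONARY: Dict[Tuple[str, ...], str] = {
    ("named", "entity", "recognition"): "TASK",
    ("self-training",): "METHOD",
}
MAX_LEN = max(len(key) for key in DICTIONARY)


def distant_annotate(tokens: List[str]) -> List[str]:
    """Assign BIO labels by greedy longest-match lookup in DICTIONARY."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for span in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(lowered[i:i + span])
            if key in DICTIONARY:
                etype = DICTIONARY[key]
                labels[i] = f"B-{etype}"
                for j in range(i + 1, i + span):
                    labels[j] = f"I-{etype}"
                i += span
                matched = True
                break
        if not matched:
            i += 1
    return labels


# "NewBench" is a made-up dataset name that is absent from the dictionary.
tokens = "We apply self-training to named entity recognition on NewBench".split()
labels = distant_annotate(tokens)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

In this sketch the unseen name "NewBench" stays labeled `O`, which mirrors the coverage problem the abstract raises: entities missing from the dictionary are silently left unannotated.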

Original language: English
Article number: 012007
Journal: Journal of Physics: Conference Series
Volume: 2506
Issue number: 1
Publication status: Published - 2023
Event: 2022 International Joint Conference on Robotics and Artificial Intelligence, JCRAI 2022 - Virtual, Online
Duration: 14 Oct 2022 to 17 Oct 2022
