Extract Then Adjust: A Two-Stage Approach for Automatic Term Extraction

Jiangyu Wang, Chong Feng*, Fang Liu, Xinyan Li, Xiaomei Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 Citation (Scopus)

Abstract

Automatic Term Extraction (ATE) is a fundamental natural language processing task that extracts relevant terms from domain-specific texts. Existing transformer-based approaches have achieved impressive improvements. However, we observe that even state-of-the-art (SOTA) extractors suffer from boundary errors, which are characterized by incorrect start or end positions of a candidate term. These minor differences between candidate terms and the ground truth lead to a noticeable performance decline. To alleviate boundary errors, we propose a two-stage extraction approach. First, we design a span-based extractor to provide high-quality candidate terms. Subsequently, we adjust the boundaries of these candidate terms to enhance performance. Experimental results show that our approach effectively identifies and corrects boundary errors in candidate terms, thereby exceeding the performance of previous state-of-the-art models.
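The extract-then-adjust idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not describe their models, so the two scoring functions below (`coarse_score`, `fine_score`), the vocabulary, the glossary, and all parameters (`max_len`, `threshold`, `max_shift`) are hypothetical stand-ins chosen only to make the two stages concrete. Stage 1 enumerates token spans and keeps high-scoring, non-overlapping candidates; stage 2 shifts each candidate's start and end by a small offset and keeps the best-scoring variant.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive
Scorer = Callable[[List[str], Span], float]


def extract_candidates(tokens: List[str], score: Scorer,
                       max_len: int = 4, threshold: float = 0.9) -> List[Span]:
    """Stage 1 (assumed form): enumerate spans, keep non-overlapping ones above threshold."""
    scored = []
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            s = score(tokens, (start, end))
            if s >= threshold:
                scored.append((s, (start, end)))
    # Prefer higher scores, then longer spans, then earlier spans.
    scored.sort(key=lambda x: (-x[0], -(x[1][1] - x[1][0]), x[1][0]))
    chosen: List[Span] = []
    for _, (start, end) in scored:
        if all(end <= cs or start >= ce for cs, ce in chosen):
            chosen.append((start, end))
    return sorted(chosen)


def adjust_boundaries(tokens: List[str], span: Span, score: Scorer,
                      max_shift: int = 1) -> Span:
    """Stage 2 (assumed form): shift start/end by up to max_shift tokens, keep the best."""
    start, end = span
    best, best_score = span, score(tokens, span)
    for ds in range(-max_shift, max_shift + 1):
        for de in range(-max_shift, max_shift + 1):
            ns, ne = start + ds, end + de
            if 0 <= ns < ne <= len(tokens):
                s = score(tokens, (ns, ne))
                if s > best_score:
                    best, best_score = (ns, ne), s
    return best


# Hypothetical toy scorers standing in for learned models.
DOMAIN_WORDS = {"neural", "network"}
GLOSSARY = {"convolutional neural network"}


def coarse_score(tokens: List[str], span: Span) -> float:
    """Fraction of span tokens found in a small domain vocabulary."""
    words = tokens[span[0]:span[1]]
    return sum(w in DOMAIN_WORDS for w in words) / len(words)


def fine_score(tokens: List[str], span: Span) -> float:
    """1.0 if the span text exactly matches a glossary term, else 0.0."""
    return 1.0 if " ".join(tokens[span[0]:span[1]]) in GLOSSARY else 0.0


tokens = "the convolutional neural network is trained".split()
candidates = extract_candidates(tokens, coarse_score)
adjusted = [adjust_boundaries(tokens, c, fine_score) for c in candidates]
print([" ".join(tokens[s:e]) for s, e in candidates])  # ['neural network']
print([" ".join(tokens[s:e]) for s, e in adjusted])    # ['convolutional neural network']
```

In this toy run, the first stage produces a candidate whose start position is off by one token, which is the kind of boundary error the abstract describes, and the second stage recovers the full term by moving the start boundary left.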

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings
Editors: Fei Liu, Nan Duan, Qingting Xu, Yu Hong
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 236-247
Number of pages: 12
ISBN (Print): 9783031446955
Publication status: Published - 2023
Event: 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023 - Foshan, China
Duration: 12 Oct 2023 to 15 Oct 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14303 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023
Country/Territory: China
City: Foshan
Period: 12/10/23 to 15/10/23
