Extract Then Adjust: A Two-Stage Approach for Automatic Term Extraction

Jiangyu Wang, Chong Feng*, Fang Liu, Xinyan Li, Xiaomei Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Citation (Scopus)

Abstract

Automatic Term Extraction (ATE) is a fundamental natural language processing task that extracts relevant terms from domain-specific texts. Existing transformer-based approaches have achieved impressive improvements. However, we observe that even state-of-the-art (SOTA) extractors suffer from boundary errors, which are characterized by an incorrect start or end position of a candidate term. These minor differences between candidate terms and the ground truth lead to a noticeable performance decline. To alleviate boundary errors, we propose a two-stage extraction approach. First, we design a span-based extractor to provide high-quality candidate terms. Subsequently, we adjust the boundaries of these candidate terms to enhance performance. Experimental results show that our approach effectively identifies and corrects boundary errors in candidate terms, thereby exceeding the performance of previous state-of-the-art models.
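
The abstract does not give implementation details, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes a candidate term is a token span with an inclusive start and exclusive end, treats a boundary error as a candidate that overlaps the gold term without matching it exactly, and adjusts boundaries by searching nearby start and end positions. The names `is_boundary_error`, `adjust_boundaries`, `toy_score`, and `TOY_TERMS` are hypothetical, and the lookup-based scorer merely stands in for a learned model.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices; start inclusive, end exclusive


def is_boundary_error(candidate: Span, gold: Span) -> bool:
    """True when the candidate overlaps the gold term but start or end differs."""
    overlaps = candidate[0] < gold[1] and gold[0] < candidate[1]
    return overlaps and candidate != gold


def adjust_boundaries(
    tokens: List[str],
    candidate: Span,
    score: Callable[[List[str]], float],
    max_shift: int = 1,
) -> Span:
    """Return the best-scoring span whose start and end each lie within
    max_shift tokens of the candidate's start and end."""
    best = candidate
    best_score = score(tokens[candidate[0]:candidate[1]])
    for ds in range(-max_shift, max_shift + 1):
        for de in range(-max_shift, max_shift + 1):
            start, end = candidate[0] + ds, candidate[1] + de
            # Skip spans that are empty or fall outside the sentence.
            if 0 <= start < end <= len(tokens):
                s = score(tokens[start:end])
                if s > best_score:
                    best, best_score = (start, end), s
    return best


# Hypothetical toy scorer: a lookup table standing in for a learned model.
TOY_TERMS = {("automatic", "term", "extraction")}


def toy_score(span_tokens: List[str]) -> float:
    return 1.0 if tuple(t.lower() for t in span_tokens) in TOY_TERMS else 0.0


tokens = "We study automatic term extraction from domain texts".split()
gold = (2, 5)       # "automatic term extraction"
candidate = (3, 5)  # "term extraction": the start position is off by one
adjusted = adjust_boundaries(tokens, candidate, toy_score)
```

In this toy example the first-stage candidate misses the leading token, and the local search recovers the full term. A real system would replace the lookup table with a trained scoring model and would need a policy for ties and for candidates that should be discarded rather than adjusted.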

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings
Editors: Fei Liu, Nan Duan, Qingting Xu, Yu Hong
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 236-247
Number of pages: 12
ISBN (Print): 9783031446955
DOIs
Publication status: Published - 2023
Event: 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023 - Foshan, China
Duration: 12 Oct 2023 to 15 Oct 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14303 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023
Country/Territory: China
City: Foshan
Period: 12/10/23 to 15/10/23

Keywords

  • automatic term extraction
  • boundary adjust
  • span extraction
