HCUKE: A Hierarchical Context-aware approach for Unsupervised Keyphrase Extraction

Chun Xu, Xian Ling Mao*, Cheng Xin Xin, Yu Ming Shang, Tian Yi Che, Hong Li Mao, Heyan Huang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Keyphrase Extraction (KE) aims to identify a concise set of words or phrases that effectively summarizes the core ideas of a document. Recent embedding-based models have achieved state-of-the-art performance by jointly modeling local and global contexts in Unsupervised Keyphrase Extraction (UKE). However, these models often ignore either sentence- or document-level contexts, leading directly to weak or incorrect global significance. Furthermore, they rely heavily on local significance, making them vulnerable to noisy data, particularly in long documents, resulting in unstable and suboptimal performance. Intuitively, hierarchical contexts enable a more accurate understanding of the candidates, thereby enhancing their global relevance. Inspired by this, we propose a novel Hierarchical Context-aware Unsupervised Keyphrase Extraction method called HCUKE. Specifically, HCUKE comprises three core modules: (i) a hierarchical context-based global significance measure module that incrementally learns global semantic information from a three-level hierarchical structure; (ii) a phrase-level local significance measure module that captures local semantic information by modeling the context interaction among candidates; and (iii) a candidate ranking module that integrates the measure scores with positional weights to compute a final ranking score. Extensive experiments on three benchmark datasets demonstrate that the proposed method significantly outperforms state-of-the-art baselines.
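
The abstract gives no formulas, so the following is only a minimal sketch of how a candidate ranking step of the kind described in (iii) might combine a global score, a local score, and a positional weight. Everything in it is an assumption made for illustration, not the authors' method: the function names, the use of cosine similarity, a single document embedding standing in for the three-level hierarchical context, the `alpha` mixing weight, and the `1 + 1/position` positional weight.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(cand_vecs, doc_vec, positions, alpha=0.5):
    """Rank candidate phrases by a hypothetical combined score.

    cand_vecs: dict mapping phrase -> embedding (assumed to be precomputed)
    doc_vec:   document embedding (assumed stand-in for hierarchical context)
    positions: dict mapping phrase -> 1-based position of first occurrence
    alpha:     assumed mixing weight between global and local scores
    """
    phrases = list(cand_vecs)
    scores = {}
    for p in phrases:
        # Global significance (assumed): similarity to the document embedding.
        global_score = cosine(cand_vecs[p], doc_vec)
        # Local significance (assumed): mean similarity to the other candidates.
        sims = [cosine(cand_vecs[p], cand_vecs[q]) for q in phrases if q != p]
        local_score = sum(sims) / len(sims) if sims else 0.0
        # Positional weight (assumed): earlier candidates get a larger weight.
        weight = 1.0 + 1.0 / positions[p]
        scores[p] = weight * (alpha * global_score + (1.0 - alpha) * local_score)
    # Highest combined score first.
    return sorted(phrases, key=lambda p: scores[p], reverse=True)


# Toy usage with made-up 2-D embeddings.
toy_vecs = {"keyphrase extraction": [1.0, 0.0], "weather": [0.0, 1.0]}
toy_ranking = rank_candidates(toy_vecs, [1.0, 0.0], {"keyphrase extraction": 1, "weather": 5})
```

In this toy case the candidate aligned with the document vector and appearing first ranks highest; real embeddings would come from a contextual encoder, which the sketch deliberately leaves out.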

Original language: English
Article number: 112511
Journal: Knowledge-Based Systems
Volume: 304
Publication status: Published - 25 Nov 2024

Keywords

  • Contextual embedding
  • Global significance
  • Hierarchical context
  • Unsupervised Keyphrase Extraction
