Keyword Extraction Method Based on Phrase Vectors and Topic Weighting

Xin Sun, Chen Ge, Chang Hong Shen, Ying Jie Zhang

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)

Abstract

Keyword extraction is a fundamental problem in natural language processing. This paper proposes PhraseVecRank, a keyphrase extraction algorithm based on phrase embedding. First, a phrase vector construction model based on LSTM (Long Short-Term Memory) and CNN (Convolutional Neural Network) is designed to address the semantic representation of complex phrases. PhraseVecRank then uses the phrase embeddings to compute a theme weight for each candidate phrase, and combines the semantic similarity between candidate phrase embeddings with co-occurrence information to compute edge weights, so that topic-weighted ranking improves keyphrase extraction. Experimental results show that PhraseVecRank effectively extracts keyphrases that cover the topic information of the text, and that the proposed phrase embedding models better represent the semantic information of phrases.
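
The ranking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the formulas, so the code assumes that the theme vector is the mean of the candidate embeddings, that the theme weight is a normalized cosine similarity to that vector, that the edge weight is the product of embedding similarity and co-occurrence count, and that ranking is a PageRank-style iteration biased by the theme weights. The function names (`cosine`, `rank_phrases`) and the damping value are hypothetical, and the phrase embeddings are taken as given rather than produced by the LSTM/CNN model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_phrases(embeddings, cooccur, damping=0.85, iterations=50):
    """Rank candidate phrases with a theme-biased, PageRank-style iteration.

    embeddings: dict mapping phrase -> list of floats (assumed precomputed).
    cooccur:    dict mapping (phrase_a, phrase_b) -> co-occurrence count.
    Returns a list of (phrase, score) pairs, highest score first.
    """
    phrases = list(embeddings)
    dim = len(embeddings[phrases[0]])

    # Assumption: the document theme is the mean of the candidate embeddings.
    theme = [sum(embeddings[p][i] for p in phrases) / len(phrases) for i in range(dim)]

    # Theme weight: non-negative cosine similarity to the theme, normalized to sum to 1.
    raw = {p: max(cosine(embeddings[p], theme), 0.0) for p in phrases}
    total = sum(raw.values()) or 1.0
    theme_weight = {p: raw[p] / total for p in phrases}

    # Assumption: edge weight = semantic similarity * co-occurrence count (undirected).
    edge = {p: {} for p in phrases}
    for (a, b), count in cooccur.items():
        weight = max(cosine(embeddings[a], embeddings[b]), 0.0) * count
        if weight > 0:
            edge[a][b] = weight
            edge[b][a] = weight
    out_sum = {p: sum(edge[p].values()) for p in phrases}

    # Biased random walk: teleport according to theme weight instead of uniformly.
    score = {p: 1.0 / len(phrases) for p in phrases}
    for _ in range(iterations):
        score = {
            p: (1 - damping) * theme_weight[p]
            + damping * sum(score[q] * edge[q][p] / out_sum[q] for q in edge[p])
            for p in phrases
        }
    return sorted(score.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_phrases` with a dictionary of phrase vectors and a dictionary of co-occurrence counts returns the candidates ordered by score. Phrases that are both close to the assumed theme vector and strongly connected to other candidates rise to the top.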

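Translated title of the contribution: The Theme-Weighted Keyphrase Extraction Algorithm Based on Phrase Embedding
Original language: Traditional Chinese
Pages (from-to): 1682-1690
Number of pages: 9
Journal: Tien Tzu Hsueh Pao/Acta Electronica Sinica
Volume: 49
Issue number: 9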
Publication status: Published - Sep 2021

Keywords

  • Auto-encoder
  • Keyphrases extraction
  • Phrase embedding
  • Theme-weighted
