基于短语向量和主题加权的关键词抽取方法

Translated title of the contribution: The Theme-Weighted Keyphrase Extraction Algorithm Based on Phrase Embedding

Xin Sun, Chen Ge, Chang Hong Shen, Ying Jie Zhang

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)

Abstract

Keyword extraction is a fundamental problem in natural language processing. This paper proposes PhraseVecRank, a keyphrase extraction algorithm based on phrase embedding. First, a phrase vector construction model based on LSTM (Long Short-Term Memory) and CNN (Convolutional Neural Network) is designed to produce semantic representations of complex phrases. Then, PhraseVecRank uses the phrase embeddings to calculate a theme weight for each candidate phrase, and combines the semantic similarity between candidate phrase embeddings with co-occurrence information to calculate edge weights, improving keyphrase extraction through topic-weighted ranking. The experimental results verify that PhraseVecRank effectively extracts keyphrases that cover the topic information of a text, and that the proposed phrase embedding models better represent the semantic information of phrases.
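The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption rather than the authors' method: the theme weight is taken as the cosine similarity between a candidate's embedding and a document vector, the edge weight as embedding similarity multiplied by the co-occurrence count, and the ranking as a PageRank iteration whose teleport distribution is the theme weights. The function name `theme_weighted_rank` and its parameters are hypothetical, and the phrase embeddings are supplied as plain inputs instead of coming from the LSTM/CNN model.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def theme_weighted_rank(embeddings, cooccurrence, doc_vector,
                        damping=0.85, iters=200, tol=1e-10):
    """Score candidate phrases with a theme-biased PageRank (illustrative only).

    embeddings:   (n, d) array, one vector per candidate phrase
    cooccurrence: (n, n) array of co-occurrence counts between candidates
    doc_vector:   (d,) vector standing in for the document's theme (assumed)
    """
    embeddings = np.asarray(embeddings, dtype=float)
    cooccurrence = np.asarray(cooccurrence, dtype=float)
    doc_vector = np.asarray(doc_vector, dtype=float)
    n = len(embeddings)

    # Assumed theme weight: non-negative similarity to the document vector,
    # normalized so the weights form a probability distribution.
    theme = np.array([max(cosine(e, doc_vector), 0.0) for e in embeddings])
    if theme.sum() == 0:
        theme = np.ones(n)
    theme = theme / theme.sum()

    # Assumed edge weight: semantic similarity times co-occurrence count.
    weights = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and cooccurrence[i, j] > 0:
                sim = max(cosine(embeddings[i], embeddings[j]), 0.0)
                weights[i, j] = sim * cooccurrence[i, j]

    # Row-normalize; a node with no outgoing weight jumps by theme weight.
    out = weights.sum(axis=1, keepdims=True)
    safe_out = np.where(out > 0, out, 1.0)
    transition = np.where(out > 0, weights / safe_out, theme)

    # Power iteration with the theme weights as the teleport distribution.
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        updated = (1 - damping) * theme + damping * (scores @ transition)
        converged = np.abs(updated - scores).sum() < tol
        scores = updated
        if converged:
            break
    return scores
```

Given the scores, the top-ranked candidates would be taken as keyphrases, for example with `np.argsort(-scores)[:k]`. Using the theme weights as the teleport distribution is one common way to bias a graph ranking toward a topic; the paper may combine the two signals differently.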

Original language: Chinese (Traditional)
Pages (from-to): 1682-1690
Number of pages: 9
Journal: Tien Tzu Hsueh Pao/Acta Electronica Sinica
Volume: 49
Issue number: 9
Publication status: Published - Sept 2021
