Disambiguating author names with embedding heterogeneous information and attentive RNN clustering parameters

Wang Ruolin, Niu Zhendong*, Lin Qika, Zhu Yifan, Qiu Ping, Lu Hao, Liu Donglei

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

3 Citations (Scopus)

Abstract

[Objective] This paper proposes a name disambiguation method for scientific literature that aims to distinguish scholars who share the same name. Existing solutions rely on document feature extraction or on the relationships between documents and co-authors, which loses higher-order attributes. [Methods] First, we established a unified feature extraction framework, the Paper Embedding Network (PaperEmbNet), which combines content and relationships to build an academic heterogeneous information network for each author. Then, we designed a Clustering Parameters Method (AR4CPM), based on an attentive recurrent neural network, to estimate the number of clusters directly. Finally, we used the hierarchical agglomerative clustering (HAC) algorithm to disambiguate author names, with the predicted number as the preset parameter. [Results] We evaluated the proposed model on the AMiner-AND dataset and found that its macro-F1 score was up to 4.75% higher than that of the second-best model, and its average training time was 5-10 minutes shorter than that of the existing baselines. [Limitations] The performance of the proposed method still needs to be evaluated in multilingual environments. [Conclusions] The proposed approach can effectively perform name disambiguation tasks.
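The final step of the abstract, running HAC with a predicted cluster count as the preset parameter, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors stand in for PaperEmbNet embeddings, `predicted_k` stands in for the AR4CPM estimate, and the choice of cosine distance with average linkage is an assumption, since the abstract does not specify either.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def hac(embeddings, n_clusters):
    """Average-linkage agglomerative clustering, stopped at n_clusters.

    Returns one integer cluster label per input embedding.
    """
    # Start with every paper in its own cluster.
    clusters = [[i] for i in range(len(embeddings))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest average distance.
        for a, b in combinations(range(len(clusters)), 2):
            dists = [
                cosine_distance(embeddings[i], embeddings[j])
                for i in clusters[a]
                for j in clusters[b]
            ]
            avg = sum(dists) / len(dists)
            if best is None or avg < best[0]:
                best = (avg, a, b)
        _, a, b = best
        # Merge cluster b into cluster a (a < b, so deleting b is safe).
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical embeddings of five papers that carry the same author name.
paper_embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.1, 1.0],
]
predicted_k = 3  # assumed output of a cluster-number estimator
labels = hac(paper_embeddings, predicted_k)
print(labels)
```

Each resulting label groups the papers attributed to one real-world author. Fixing the cluster count in advance removes the need to tune a distance threshold for every ambiguous name, which is the role the abstract assigns to the predicted number.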

Original language: English
Pages (from-to): 13-24
Number of pages: 12
Journal: Data Analysis and Knowledge Discovery
Volume: 5
Issue number: 8
Publication status: Published - 2021
