Predicting citation counts of papers

Junpeng Chen, Chunxia Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

28 Citations (Scopus)

Abstract

The task of citation count prediction is to predict the number of citations a paper will have received after a given time period. Future citation counts are an important metric for estimating the potential influence of published papers, and can help researchers choose representative literature. This task can be treated as a regression problem. This paper proposes two types of predictive features to represent fundamental characteristics of papers and authors: six content features and ten author features. We use IBM Model 1 to calculate the association probabilities between paper topics, which are employed to extract content features, and use bipartite network projection to obtain the author collaboration network, which is used to extract author features. Further, we use Gradient Boosted Regression Trees to predict the citation counts of papers. Our approach combines the contents and topics of papers with multi-dimensional measures of author collaboration in one learning process. Experimental results on the KDD CUP dataset demonstrate that our predictive features and models are effective for predicting the citation counts of papers.
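The bipartite network projection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how edges in the author collaboration network are weighted, so the version below assumes the simplest choice: two authors are linked if they share at least one paper, and the edge weight is the number of shared papers. The function names `project_authors` and `collaborator_counts` and the toy data are hypothetical, not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def project_authors(paper_authors):
    """Project a paper-author bipartite graph onto its author side.

    paper_authors maps a paper id to the list of its authors.
    Returns a dict mapping an author pair (a sorted tuple) to the number
    of papers the two authors share. This co-occurrence weight is an
    assumption; the paper may weight edges differently.
    """
    weights = defaultdict(int)
    for authors in paper_authors.values():
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def collaborator_counts(weights):
    """Count distinct co-authors per author (degree in the projection)."""
    neighbours = defaultdict(set)
    for a, b in weights:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {author: len(others) for author, others in neighbours.items()}


# Toy data, invented for illustration only.
papers = {
    "p1": ["Ann", "Bob"],
    "p2": ["Ann", "Bob", "Cy"],
    "p3": ["Cy", "Dee"],
}
network = project_authors(papers)
degrees = collaborator_counts(network)
```

A per-author quantity such as the number of distinct collaborators is one plausible kind of author feature that could be read off such a network; the ten author features actually used are not listed in the abstract.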

Original language: English
Title of host publication: Proceedings of 2015 IEEE 14th International Conference on Cognitive Informatics and Cognitive Computing, ICCI*CC 2015
Editors: Philip Chen, Lotfi A. Zadeh, Ning Ge, Yingxu Wang, Xiaoming Tao, Jianhua Lu, Newton Howard, Bo Zhang
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 434-440
Number of pages: 7
ISBN (Electronic): 9781467372893
Publication status: Published - 11 Sept 2015
Event: 14th IEEE International Conference on Cognitive Informatics and Cognitive Computing, ICCI*CC 2015 - Beijing, China
Duration: 6 Jul 2015 to 8 Jul 2015

Publication series

Name: Proceedings of 2015 IEEE 14th International Conference on Cognitive Informatics and Cognitive Computing, ICCI*CC 2015

Conference

Conference: 14th IEEE International Conference on Cognitive Informatics and Cognitive Computing, ICCI*CC 2015
Country/Territory: China
City: Beijing
Period: 6/07/15 to 8/07/15

Keywords

  • Gradient Boosted Regression Trees
  • IBM Model 1
  • bipartite network projection
  • citation counts prediction
