An approach for identifying author profiles of blogs

Chunxia Zhang*, Yu Guo, Jiayu Wu, Shuliang Wang, Zhendong Niu, Wen Cheng

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Author profile identification is an important research problem in web mining, network public opinion monitoring and social network analysis. The aim is to identify characteristics or traits of the authors of textual information such as blogs, microblogs or reviews on social network or commercial platforms. Author profile identification can be applied in many areas, including cyberspace forensics, electronic commerce and information security. In this paper, we propose a hybrid framework to solve the author profile identification problem. In this framework, we design a distributed integrated representation of blogs based on Doc2vec and term frequency-inverse document frequency, and apply a convolutional neural network to predict the age, gender and education status of blog authors. The benefits of our technique are that it predicts three different author traits in a uniform way, learns representation vectors of blog posts from unlabeled data in an unsupervised manner, and does not need any syntactic or semantic parsing of sentences. Experimental results on blogs show that our approach achieves promising performance.
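
As a rough illustration of the representation step described in the abstract, the sketch below computes TF-IDF weights for tokenized posts and joins them with a dense document vector. This is not the authors' implementation: the abstract does not say how the Doc2vec and TF-IDF parts are integrated, so simple concatenation is an assumption, the function names `tfidf_vectors` and `combine` are hypothetical, and the dense vectors are hand-written placeholders standing in for Doc2vec output. The convolutional classifier is omitted.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF, so a term found in every document keeps a nonzero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def combine(dense_vec, tfidf_vec):
    """Concatenate a dense embedding with a TF-IDF vector (assumed fusion)."""
    return list(dense_vec) + list(tfidf_vec)


posts = [
    "i love my new phone".split(),
    "my thesis defense is next week".split(),
]
vocab, tfidf = tfidf_vectors(posts)

# Placeholder 3-dimensional embeddings; a real pipeline would obtain these
# from a trained Doc2vec model instead.
dense = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]]
features = [combine(d, t) for d, t in zip(dense, tfidf)]
```

Each row of `features` is one blog post's combined vector, which a downstream classifier could take as input to predict a trait such as age group or gender.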

Original language: English
Title of host publication: Advanced Data Mining and Applications - 13th International Conference, ADMA 2017, Proceedings
Editors: Wen-Chih Peng, Wei Emma Zhang, Gao Cong, Aixin Sun, Chengliang Li
Publisher: Springer Verlag
Pages: 475-487
Number of pages: 13
ISBN (Print): 9783319691787
Publication status: Published - 2017
Event: 13th International Conference on Advanced Data Mining and Applications, ADMA 2017 - Singapore, Singapore
Duration: 5 Nov 2017 → 6 Nov 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10604 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th International Conference on Advanced Data Mining and Applications, ADMA 2017
Country/Territory: Singapore
City: Singapore
Period: 5/11/17 → 6/11/17

Keywords

  • Age prediction
  • Author profile identification
  • Convolutional neural network
  • Doc2vec
  • Education status prediction
  • Gender prediction
