Incorporating Entity Correlation Knowledge into Topic Modeling

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for exploring hidden topics in text corpora. The standard LDA model assigns a topic to each word independently and lacks a mechanism for using rich prior background knowledge to learn semantically coherent topics. To address this problem, in this paper we propose a model called Entity Correlation Latent Dirichlet Allocation (EC-LDA), which incorporates constraints derived from entity correlations into the LDA topic model as prior knowledge. Unlike other knowledge-based topic models, which extract knowledge directly from the training dataset itself or even from human judgements, our work takes its prior knowledge from an external knowledge base (Freebase, in our experiment). Hence, our approach is better suited to a wide variety of text corpora in different scenarios. We fit the proposed model using Gibbs sampling. Experimental results demonstrate the effectiveness of our model compared with standard LDA.
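For readers unfamiliar with the baseline, the following is a minimal sketch of collapsed Gibbs sampling for standard LDA, the model the abstract compares against. It is not the EC-LDA sampler: the abstract does not specify how the entity correlation constraints enter the sampling step, so that part is deliberately left out. The function name `lda_gibbs`, its parameters, and the symmetric hyperparameter defaults `alpha` and `beta` are illustrative assumptions.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for standard LDA (a baseline sketch, not EC-LDA).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (doc_topic_counts, topic_word_counts, assignments).
    """
    rng = random.Random(seed)
    n_dk = [[0] * num_topics for _ in docs]                # topic counts per document
    n_kw = [[0] * vocab_size for _ in range(num_topics)]   # word counts per topic
    n_k = [0] * num_topics                                 # total tokens per topic
    z = []                                                 # topic assignment per token

    # Randomly initialise the topic assignment of every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the current token from all counts.
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Unnormalised conditional p(z = t | all other assignments).
                weights = [
                    (n_dk[d][t] + alpha)
                    * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return n_dk, n_kw, z


# Toy corpus with hypothetical word ids, for illustration only.
toy_docs = [[0, 1, 0, 2], [3, 4, 3, 5], [0, 2, 1, 1]]
doc_topic, topic_word, assignments = lda_gibbs(toy_docs, num_topics=2, vocab_size=6, iters=50)
```

In this sketch each token's topic is resampled from counts alone, which reflects the limitation the abstract describes: nothing ties the topics of related words together. A knowledge-based variant would have to modify the conditional distribution or the priors, and how EC-LDA does this is described in the paper itself rather than here.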

Original language: English
Title of host publication: Proceedings - 2017 IEEE International Conference on Big Knowledge, ICBK 2017
Editors: Xindong Wu, Tamer Ozsu, Jim Hendler, Ruqian Lu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 254-258
Number of pages: 5
ISBN (Electronic): 9781538631195
Publication status: Published - 30 Aug 2017
Externally published: Yes
Event: 8th IEEE International Conference on Big Knowledge, ICBK 2017 - Hefei, China
Duration: 9 Aug 2017 → 10 Aug 2017

Publication series

Name: Proceedings - 2017 IEEE International Conference on Big Knowledge, ICBK 2017

Conference

Conference: 8th IEEE International Conference on Big Knowledge, ICBK 2017
Country/Territory: China
City: Hefei
Period: 9/08/17 → 10/08/17

Keywords

  • Gibbs sampling
  • entity correlation
  • knowledge base
  • prior knowledge
  • topic model
