A supervised parameter estimation method of LDA

Zhenyan Liu, Dan Meng, Weiping Wang, Chunxia Zhang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Citation (Scopus)

Abstract

Latent Dirichlet Allocation (LDA), a probabilistic topic model, is an effective dimension-reduction tool that automatically extracts latent topics and represents text in a lower-dimensional semantic topic space. However, the original LDA and most of its variants are unsupervised and make no reference to the category labels of the documents in the training corpus. Most of them also treat all terms in the vocabulary as equally important, although term weights differ, especially in a skewed corpus in which some categories have many more samples than others. We therefore propose a supervised parameter estimation method, based on category and document information, that estimates the parameters of LDA according to term weight. Comparative experiments show that the proposed method is superior for skewed text classification and can largely improve the recall and precision of the minority category.
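
The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of one common way to let term weights influence LDA parameter estimation: a collapsed Gibbs sampler whose topic counts are incremented by a per-term weight instead of 1. The function name `weighted_gibbs_lda`, the `weights` input, and the hyperparameter defaults are assumptions made for illustration and are not taken from the paper. How the authors derive term weights from category and document information is not shown here.

```python
import random


def weighted_gibbs_lda(docs, weights, n_topics, alpha=0.1, beta=0.01,
                       n_iter=50, seed=0):
    """Collapsed Gibbs sampling for LDA with term-weighted counts (a sketch).

    docs:    list of documents, each a list of integer term ids
    weights: hypothetical per-term weights; weights[w] replaces the usual
             count of 1 for every occurrence of term w
    Returns (theta, phi): document-topic and topic-term distributions.
    """
    rng = random.Random(seed)
    vocab_size = len(weights)
    n_dk = [[0.0] * n_topics for _ in docs]                # doc-topic mass
    n_kw = [[0.0] * vocab_size for _ in range(n_topics)]   # topic-term mass
    n_k = [0.0] * n_topics                                 # total mass per topic

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += weights[w]
            n_kw[k][w] += weights[w]
            n_k[k] += weights[w]
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wt = weights[w]
                k = z[d][i]
                # Remove the current token's weighted contribution.
                n_dk[d][k] -= wt
                n_kw[k][w] -= wt
                n_k[k] -= wt
                # Unnormalised conditional over topics for this token.
                p = [(n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                     / (n_k[t] + vocab_size * beta)
                     for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights=p)[0]
                z[d][i] = k
                # Add the token back under its newly sampled topic.
                n_dk[d][k] += wt
                n_kw[k][w] += wt
                n_k[k] += wt

    # Point estimates of the parameters from the final weighted counts.
    phi = [[(n_kw[k][w] + beta) / (n_k[k] + vocab_size * beta)
            for w in range(vocab_size)] for k in range(n_topics)]
    theta = [[(n_dk[d][k] + alpha) / (sum(n_dk[d]) + n_topics * alpha)
              for k in range(n_topics)] for d in range(len(docs))]
    return theta, phi


# Toy usage: terms 2 and 3 get a larger (hypothetical) weight.
toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 2, 3]]
toy_weights = [1.0, 1.0, 2.5, 2.5]
theta, phi = weighted_gibbs_lda(toy_docs, toy_weights, n_topics=2, n_iter=20)
```

With every weight set to 1.0 this reduces to the standard unweighted collapsed Gibbs sampler, so the weights are the only place where supervision could enter in this sketch.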

Original language: English
Title of host publication: Web Technologies and Applications - 17th Asia-Pacific Web Conference, APWeb 2015, Proceedings
Editors: Reynold Cheng, Bin Cui, Zhenjie Zhang, Ruichu Cai, Jia Xu
Publisher: Springer Verlag
Pages: 401-410
Number of pages: 10
ISBN (Print): 9783319252544
DOIs
Publication status: Published - 2015
Externally published: Yes
Event: 17th Asia-Pacific Web Conference, APWeb 2015 - Guangzhou, China
Duration: 18 Sept 2015 to 20 Sept 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9313
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th Asia-Pacific Web Conference, APWeb 2015
Country/Territory: China
City: Guangzhou
Period: 18/09/15 to 20/09/15

Keywords

  • Gibbs sampling
  • LDA
  • Parameter estimation
  • Skewed text classification
  • Term weighting
