A supervised parameter estimation method of LDA

  • CAS - Institute of Computing Technology
  • University of Chinese Academy of Sciences
  • CAS - Institute of Information Engineering
  • Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

The Latent Dirichlet Allocation (LDA) probabilistic topic model is an effective dimension-reduction tool that automatically extracts latent topics and represents text in a lower-dimensional semantic topic space. However, the original LDA and most of its variants are unsupervised and make no reference to the category labels of the documents in the training corpus. Most of them also treat all terms in the vocabulary as equally important, although term weights differ, especially in a skewed corpus in which some categories have many more samples than others. We therefore propose a supervised parameter estimation method, based on category and document information, that estimates the parameters of LDA according to term weight. Comparative experiments show that the proposed method is superior for skewed text classification and largely improves the recall and precision of the minority category.

Original language: English
Title of host publication: Web Technologies and Applications - 17th Asia-Pacific Web Conference, APWeb 2015, Proceedings
Editors: Reynold Cheng, Bin Cui, Zhenjie Zhang, Ruichu Cai, Jia Xu
Publisher: Springer Verlag
Pages: 401-410
Number of pages: 10
ISBN (Print): 9783319252544
DOI
Publication status: Published - 2015
Externally published
Event: 17th Asia-Pacific Web Conference, APWeb 2015 - Guangzhou, China
Duration: 18 Sep 2015 to 20 Sep 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9313
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th Asia-Pacific Web Conference, APWeb 2015
Country/Territory: China
City: Guangzhou
Period: 18/09/15 to 20/09/15
