Knowledge base enhanced topic modeling

Dandan Song, Jingwen Gao, Jinhui Pang, Lejian Liao, Lifei Qin

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

4 Citations (Scopus)

Abstract

Topic models such as Latent Dirichlet Allocation (LDA) have been successful at learning hidden topics and are widely applied in text mining. Many recently developed augmented topic modeling methods exploit metadata. However, the performance of topic models is still not comparable to that of humans. We believe one key reason is that humans have background knowledge, which is essential for topic understanding. Inspired by this, we propose a knowledge base enhanced topic model. We treat knowledge bases, with their huge collections of entities and relations, as good representations of human knowledge. We assume that documents with related entities tend to have similar topic distributions. Based on this assumption, we compute document similarity via the linked entities and use it as a constraint for LDA. More specifically, we embed entities in a low-dimensional space via DeepWalk and use Entity Mover's Distance to measure the similarity between documents efficiently and effectively. Experiments on two real-world datasets show that our method improves LDA on document classification while requiring no supervision information.
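The abstract does not spell out how Entity Mover's Distance is computed, so the following is only a minimal sketch by analogy with Word Mover's Distance. It uses the relaxed lower bound, in which each entity sends all of its mass to the nearest entity in the other document, rather than the exact optimal-transport problem. The function names, the entity names, and the 2-D vectors are hypothetical; the vectors stand in for DeepWalk embeddings.

```python
import math
from collections import Counter


def normalized_weights(entities):
    """Turn a list of linked entity IDs into a normalized bag of entities."""
    counts = Counter(entities)
    total = sum(counts.values())
    return {entity: count / total for entity, count in counts.items()}


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_sided_cost(src, dst, emb):
    """Cost when every source entity moves all its mass to its nearest target."""
    return sum(
        weight * min(euclidean(emb[e], emb[f]) for f in dst)
        for e, weight in src.items()
    )


def relaxed_entity_movers_distance(doc_a, doc_b, emb):
    """Relaxed lower bound on a mover's distance between two entity lists.

    Assumption: this is the relaxed bound known from Word Mover's Distance,
    not necessarily the formulation used in the paper.
    """
    a = normalized_weights(doc_a)
    b = normalized_weights(doc_b)
    return max(one_sided_cost(a, b, emb), one_sided_cost(b, a, emb))


# Hypothetical toy embeddings standing in for DeepWalk output.
toy_embeddings = {
    "Beijing": (0.0, 0.0),
    "Nanjing": (0.0, 1.0),
    "Python": (3.0, 4.0),
}

doc_1 = ["Beijing", "Nanjing"]  # linked entities of a document about cities
doc_2 = ["Nanjing"]             # shares an entity with doc_1
doc_3 = ["Python"]              # unrelated entity, far away in the toy space

d_12 = relaxed_entity_movers_distance(doc_1, doc_2, toy_embeddings)
d_13 = relaxed_entity_movers_distance(doc_1, doc_3, toy_embeddings)
print(d_12 < d_13)
```

Under these toy values, the pair that shares an entity comes out closer than the unrelated pair, which is the kind of similarity signal the abstract describes feeding into LDA as a constraint. An exact variant would replace the relaxed bound with a transport problem solved by a linear-programming or optimal-transport solver.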

Original language: English
Title of host publication: Proceedings - 11th IEEE International Conference on Knowledge Graph, ICKG 2020
Editors: Enhong Chen, Grigoris Antoniou, Xindong Wu, Vipin Kumar
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 380-387
Number of pages: 8
ISBN (Electronic): 9781728181561
DOI
Publication status: Published - Aug 2020
Event: 11th IEEE International Conference on Knowledge Graph, ICKG 2020 - Virtual, Nanjing, China
Duration: 9 Aug 2020 → 11 Aug 2020

Publication series

Name: Proceedings - 11th IEEE International Conference on Knowledge Graph, ICKG 2020

Conference

Conference: 11th IEEE International Conference on Knowledge Graph, ICKG 2020
Country/Territory: China
City: Virtual, Nanjing
Period: 9/08/20 → 11/08/20
