Locally weighted embedding topic modeling by Markov random walk structure approximation and sparse regularization

Chao Wei, Senlin Luo, Limin Pan*, Zhouting Wu, Ji Zhang, Qamas Gul Khan Safi

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

1 Citation (Scopus)

Abstract

Topic models are a practical method for learning interpretable models of text corpora, and topic modeling has become a key problem in document representation. Some recently proposed topic models incorporate the intrinsic geometrical information of the document manifold and yield a discriminative topic representation. However, existing manifold-inspired topic models fail to provide probability weighting information for the local geometrical pattern, which limits their ability to estimate the intrinsic semantic information of the topic representation. In this paper, we consider the problem of topic modeling with the intrinsic structure of the document manifold and propose an unsupervised AutoEncoder-based topic modeling framework, named locally weighted embedding topic model (LWE-TM). Unlike existing manifold-inspired topic models, LWE-TM defines a group of probability coefficients to uncover the local geometrical pattern through the Markov random walk structure of the affinity graph, and regularizes the training of a sparse AutoEncoder (sAE) to explicitly recover this local geometrical pattern in the topic encoding. Under the regularized training framework, the encoding network becomes locally invariant around the neighborhood of the document manifold and enables ready topic inference for out-of-sample documents, improving the generalization and discrimination of the topic encoding. Experimental results on two widely used corpora demonstrate the superiority of LWE-TM over comparative models in document modeling, document clustering, and classification tasks.
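
To illustrate the kind of probability weighting the abstract refers to, the following is a minimal sketch of one-step Markov random walk transition probabilities on a k-nearest-neighbor affinity graph. It is not the authors' formulation: the Gaussian affinity, the parameters `k` and `sigma`, and the function name `random_walk_weights` are all assumptions made for illustration, since the abstract does not specify how the coefficients are defined.

```python
import numpy as np

def random_walk_weights(X, k=2, sigma=1.0):
    """Row-stochastic one-step transition matrix on a kNN affinity graph.

    Hypothetical helper: the Gaussian kernel and the kNN construction are
    assumptions, not details taken from the paper.
    """
    n = X.shape[0]
    # Squared Euclidean distances between all pairs of document vectors.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # exclude self-loops from the neighbor search

    # Gaussian affinity, kept only for each document's k nearest neighbors.
    W = np.zeros((n, n))
    neighbors = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = neighbors.ravel()
    W[rows, cols] = np.exp(-d2[rows, cols] / (2.0 * sigma ** 2))

    # Normalizing each row turns affinities into transition probabilities.
    return W / W.sum(axis=1, keepdims=True)

# Toy data standing in for document vectors (random, not a real corpus).
rng = np.random.default_rng(0)
X = rng.random((6, 4))
P = random_walk_weights(X, k=2)
```

Each row of `P` sums to one and is nonzero only on that document's neighbors, so it can be read as a set of local probability coefficients. In a regularized autoencoder of the kind described, such coefficients would plausibly weight a penalty tying each document's encoding to its neighbors' encodings, though the exact penalty used in LWE-TM is not given here.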

Original language: English
Pages (from-to): 35-50
Number of pages: 16
Journal: Neurocomputing
Volume: 285
Publication status: Published - 12 Apr 2018

Keywords

  • Affine mapping
  • Markov random walk
  • Sparse AutoEncoder
  • Topic model
