Abstract
Existing topic models that aim to discover relationships between topics typically focus on two types of relations: undirected correlation and tree-structured hierarchy. However, these approaches are limited in their ability to capture complicated dependencies between topics, such as causality, where a topic may depend on multiple other topics. To address this limitation, we propose a novel Dependency-Aware Neural Topic Model (DepNTM). The model treats the dependency relationships between topics as a generative process, in which each topic is generated from other topics according to their dependencies. Specifically, we utilize a Structural Causal Model within a variational autoencoder (VAE) framework to learn the topic dependencies as a Directed Acyclic Graph (DAG). To further map the latent topics to comprehensible semantics and ensure interpretability, we use document labels as a supervision signal. We conduct experiments on two public real-world corpora from the arXiv system and the Stack Overflow website. The experimental results show that DepNTM outperforms nine state-of-the-art (SOTA) topic models in most cases, in terms of perplexity (reduced by 9% on average relative to the SOTA baselines) and topic quality (improved by 58%). We further conduct extensive analytical experiments to demonstrate the reliability of DepNTM.
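
The abstract only states that a Structural Causal Model is embedded in a VAE to learn a DAG over topics, without implementation details. The snippet below is a minimal, illustrative sketch of one common way such a layer can be realized: a linear SCM over the latent topic vector with a NOTEARS-style acyclicity penalty. The class name, hyperparameters, and the linear form of the SCM are assumptions for illustration, not the authors' actual DepNTM code.

```python
# Illustrative sketch only: a linear-SCM layer over latent topics with a
# NOTEARS-style acyclicity penalty. All names and choices here are assumed;
# the abstract does not specify DepNTM's implementation.
import torch
import torch.nn as nn


class LinearSCMTopicLayer(nn.Module):
    """Maps exogenous topic noise eps to dependency-aware topics z via a
    learnable weighted adjacency matrix A over topics (intended to be a DAG)."""

    def __init__(self, num_topics: int):
        super().__init__()
        self.num_topics = num_topics
        # A[i, j] = strength of the assumed causal edge topic_i -> topic_j.
        self.A = nn.Parameter(torch.zeros(num_topics, num_topics))

    def forward(self, eps: torch.Tensor) -> torch.Tensor:
        # Solve the linear SCM z = A^T z + eps, i.e. z = (I - A^T)^{-1} eps.
        eye = torch.eye(self.num_topics, device=eps.device)
        return torch.linalg.solve(eye - self.A.t(), eps.unsqueeze(-1)).squeeze(-1)

    def acyclicity_penalty(self) -> torch.Tensor:
        # NOTEARS constraint h(A) = tr(exp(A ∘ A)) - d, which is zero iff A is a DAG.
        return torch.trace(torch.matrix_exp(self.A * self.A)) - self.num_topics


if __name__ == "__main__":
    torch.manual_seed(0)
    scm = LinearSCMTopicLayer(num_topics=10)
    eps = torch.randn(4, 10)           # exogenous noise, e.g. from a VAE encoder
    z = scm(eps)                       # dependency-aware topic representation
    # Dummy objective: stand-in reconstruction term plus the DAG penalty.
    loss = z.pow(2).mean() + 0.1 * scm.acyclicity_penalty()
    loss.backward()
    print(z.shape, scm.acyclicity_penalty().item())
```

In a full model, `z` would feed the VAE decoder that reconstructs documents, and the acyclicity penalty would be added to the ELBO so that the learned topic-dependency graph stays a DAG.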
Original language | English |
---|---|
Article number | 103530 |
Journal | Information Processing and Management |
Volume | 61 |
Issue number | 1 |
DOIs | |
Publication status | Published - Jan 2024 |
Keywords
- Complicated dependency
- Topic modeling
- Topic relationship