TY - GEN
T1 - Beyond Labels and Topics
T2 - 33rd ACM Web Conference, WWW 2024
AU - Tang, Yi Kun
AU - Huang, Heyan
AU - Shi, Xuewen
AU - Mao, Xian Ling
N1 - Publisher Copyright:
© 2024 ACM.
PY - 2024/5/13
Y1 - 2024/5/13
N2 - Topic models that can take advantage of labels are broadly used in identifying interpretable topics from textual data. However, existing topic models tend to merely view labels as names of topic clusters or as categories of texts, thereby neglecting the potential causal relationships between supervised information and latent topics, as well as within these elements themselves. In this paper, we focus on uncovering possible causal relationships both between and within the supervised information and latent topics to better understand the mechanisms behind the emergence of the topics and the labels. To this end, we propose Causal Relationship-Aware Neural Topic Model (CRNTM), a novel neural topic model that can automatically uncover interpretable causal relationships between and within supervised information and latent topics, while concurrently discovering high-quality topics. In CRNTM, both supervised information and latent topics are treated as nodes, with the causal relationships represented as directed edges in a Directed Acyclic Graph (DAG). A Structural Causal Model (SCM) is employed to model the DAG. Experiments are conducted on three public corpora with different types of labels. Experimental results show that the discovered causal relationships are both reliable and interpretable, and the learned topics are of high quality comparing with eight start-of-the-art topic model baselines.
AB - Topic models that can take advantage of labels are broadly used in identifying interpretable topics from textual data. However, existing topic models tend to merely view labels as names of topic clusters or as categories of texts, thereby neglecting the potential causal relationships between supervised information and latent topics, as well as within these elements themselves. In this paper, we focus on uncovering possible causal relationships both between and within the supervised information and latent topics to better understand the mechanisms behind the emergence of the topics and the labels. To this end, we propose Causal Relationship-Aware Neural Topic Model (CRNTM), a novel neural topic model that can automatically uncover interpretable causal relationships between and within supervised information and latent topics, while concurrently discovering high-quality topics. In CRNTM, both supervised information and latent topics are treated as nodes, with the causal relationships represented as directed edges in a Directed Acyclic Graph (DAG). A Structural Causal Model (SCM) is employed to model the DAG. Experiments are conducted on three public corpora with different types of labels. Experimental results show that the discovered causal relationships are both reliable and interpretable, and the learned topics are of high quality comparing with eight start-of-the-art topic model baselines.
KW - causal relationships discovery
KW - neural topic model
KW - structural causal model
UR - http://www.scopus.com/inward/record.url?scp=85194064110&partnerID=8YFLogxK
U2 - 10.1145/3589334.3645715
DO - 10.1145/3589334.3645715
M3 - Conference contribution
AN - SCOPUS:85194064110
T3 - WWW 2024 - Proceedings of the ACM Web Conference
SP - 4460
EP - 4469
BT - WWW 2024 - Proceedings of the ACM Web Conference
PB - Association for Computing Machinery, Inc
Y2 - 13 May 2024 through 17 May 2024
ER -