A novel fast framework for topic labeling based on similarity-preserved hashing

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Recently, topic modeling has been widely applied in data mining due to its powerful ability. A common, major challenge in applying such topic models to other tasks is to accurately interpret the meaning of each topic. Topic labeling, as a major interpreting method, has attracted significant attention. However, most previous work focuses only on the effectiveness of topic labeling, and less attention has been paid to quickly creating good topic descriptors. Meanwhile, it is hard to assign labels to newly emerging topics with most existing methods. To solve these problems, we propose a novel fast topic labeling framework that casts the labeling problem as a k-nearest neighbor (KNN) search problem over probability distributions. Our experimental results show that the proposed sequential interleaving method, based on locality-sensitive hashing (LSH), efficiently speeds up comparisons among probability distributions, and that the proposed framework can generate meaningful labels to interpret topics, including newly emerging ones.
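
The general idea of treating labeling as a nearest-neighbor search over probability distributions, with hashing used to speed up the comparisons, can be illustrated with a minimal sketch. This is not the paper's sequential interleaving method, whose details the abstract does not give. It assumes sign-random-projection LSH applied to square-rooted distributions, followed by an exact Hellinger-distance rerank. The function names, the toy vocabulary size, and the candidate labels are all hypothetical.

```python
import numpy as np

def signatures(dists, planes):
    """Sign-random-projection bits for square-rooted distributions.

    Taking the square root places each distribution on the unit sphere,
    so the angle between two vectors reflects their Hellinger distance.
    """
    return (np.sqrt(dists) @ planes.T) >= 0

def knn_labels(topic, label_dists, label_names, k=2,
               n_bits=64, n_candidates=3, seed=0):
    """Return the k candidate labels closest to `topic` (illustrative only)."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, label_dists.shape[1]))

    # Cheap filter: Hamming distance between bit signatures.
    label_sigs = signatures(label_dists, planes)
    topic_sig = signatures(topic[None, :], planes)[0]
    hamming = (label_sigs != topic_sig).sum(axis=1)
    candidates = np.argsort(hamming, kind="stable")[:n_candidates]

    # Exact rerank of the surviving candidates by Hellinger distance.
    diff = np.sqrt(label_dists[candidates]) - np.sqrt(topic)
    hellinger = np.sqrt(0.5 * (diff ** 2).sum(axis=1))
    order = candidates[np.argsort(hellinger, kind="stable")[:k]]
    return [label_names[i] for i in order]

# Hypothetical candidate labels, each a word distribution over a 4-word vocabulary.
label_names = ["sports", "finance", "politics"]
label_dists = np.array([
    [0.70, 0.20, 0.05, 0.05],
    [0.05, 0.05, 0.70, 0.20],
    [0.10, 0.60, 0.20, 0.10],
])
topic = np.array([0.65, 0.25, 0.05, 0.05])

print(knn_labels(topic, label_dists, label_names, k=2))
```

In this toy setup `n_candidates` equals the number of labels, so the hash stage filters nothing and the result is the exact Hellinger ranking. With a large label collection, a smaller `n_candidates` would trade some accuracy for speed.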

Original language: English
Title of host publication: COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016
Subtitle of host publication: Technical Papers
Publisher: Association for Computational Linguistics, ACL Anthology
Pages: 3339-3348
Number of pages: 10
ISBN (Print): 9784879747020
Publication status: Published - 2016
Event: 26th International Conference on Computational Linguistics, COLING 2016 - Osaka, Japan
Duration: 11 Dec 2016 to 16 Dec 2016

Publication series

Name: COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers

Conference

Conference: 26th International Conference on Computational Linguistics, COLING 2016
Country/Territory: Japan
City: Osaka
Period: 11/12/16 to 16/12/16
