A hot topic detection method for Chinese Microblog based on topic words

Jun Zheng*, Yuanjun Li

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

4 Citations (Scopus)

Abstract

Microblogging is a new network medium that has grown quickly. Detecting and tracking hot topics in microblogs has attracted wide attention from scholars at home and abroad in recent years. Algorithms designed to find topics in long texts, such as those on traditional news websites and blogs, cannot handle sparse microblog data effectively. This paper contributes a method that identifies hot topics in microblogs based on topic words. After preprocessing the microblog data and dividing it into time windows, the method extracts topic words in each time window according to two factors: the rate of increase in word frequency and the relative word frequency. It then clusters the topic words according to the similarity among them and selects a suitable cluster of topic words to describe each hot topic, thereby detecting hot topics in microblogs. Experiments show that the method improves detection efficiency to a certain extent and raises both recall and precision, so that hot topics in microblogs can be found effectively and in a timely manner.
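The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the add-one smoothed growth ratio, the relative-frequency definition, the threshold values, the Jaccard co-occurrence similarity, the greedy one-pass grouping, and the function names `topic_words` and `cluster_topic_words`. Posts are assumed to be preprocessed and segmented into words already.

```python
from collections import Counter


def topic_words(prev_window, curr_window, min_growth=2.0, min_rel_freq=0.05):
    """Select candidate topic words for the current time window.

    Each window is a list of posts; each post is a list of words.
    Both scores and both thresholds are illustrative assumptions.
    """
    prev = Counter(w for post in prev_window for w in post)
    curr = Counter(w for post in curr_window for w in post)
    total = sum(curr.values())
    picked = []
    for word, count in curr.items():
        growth = count / (prev[word] + 1)  # add-one smoothing avoids division by zero
        rel_freq = count / total           # share of all word occurrences in this window
        if growth >= min_growth and rel_freq >= min_rel_freq:
            picked.append(word)
    return sorted(picked)


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_topic_words(words, posts, min_sim=0.5):
    """Greedy one-pass grouping: a word joins the first cluster that already
    holds a word appearing in a sufficiently similar set of posts."""
    containing = {w: {i for i, post in enumerate(posts) if w in post} for w in words}
    clusters = []
    for w in words:
        for cluster in clusters:
            if any(jaccard(containing[w], containing[o]) >= min_sim for o in cluster):
                cluster.append(w)
                break
        else:
            clusters.append([w])
    return clusters


# Made-up toy data: two consecutive time windows of segmented posts.
prev_window = [["weather", "today"], ["lunch", "today"]]
curr_window = [
    ["earthquake", "sichuan", "today"],
    ["earthquake", "sichuan"],
    ["earthquake", "rescue", "sichuan"],
    ["lunch", "today"],
    ["concert", "tickets"],
    ["concert", "tickets"],
]

words = topic_words(prev_window, curr_window)
clusters = cluster_topic_words(words, curr_window)
print(words)
print(clusters)
```

In this toy data, words that surge in the current window pass both filters, while steady words such as "today" are dropped by the growth threshold. Each resulting cluster stands in for one candidate hot topic; how the paper actually scores, clusters, and filters topic words may differ.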

Original language: English
Title of host publication: Proceedings of 2nd International Conference on Information Technology and Electronic Commerce, ICITEC 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 262-266
Number of pages: 5
ISBN (Electronic): 9781479952984
Publication status: Published - 11 May 2014
Event: 2nd International Conference on Information Technology and Electronic Commerce, ICITEC 2014 - Dalian, China
Duration: 20 Dec 2014 to 21 Dec 2014

Publication series

Name: Proceedings of 2nd International Conference on Information Technology and Electronic Commerce, ICITEC 2014

Conference

Conference: 2nd International Conference on Information Technology and Electronic Commerce, ICITEC 2014
Country/Territory: China
City: Dalian
Period: 20/12/14 to 21/12/14
