Abstract
Existing event detection methods rely mainly on the traditional TF-IDF document representation, which is high-dimensional and semantically sparse, resulting in low efficiency and accuracy. They are therefore ill-suited to large-scale online news event detection. This paper proposes a document representation method based on word embedding. The method reduces the dimensionality of the document representation, alleviates semantic sparsity, and improves both the efficiency and the accuracy of document similarity calculation. Building on this representation, a dynamic online clustering method is proposed for online news event detection, which improves both the accuracy and the recall of event detection. Experiments on the standard TDT4 dataset and on a real-world dataset show that the proposed adaptive online event detection method significantly outperforms state-of-the-art methods in both efficiency and accuracy.
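The two ideas in the abstract, a dense embedding-based document vector and an online clustering pass over incoming news, can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify the pooling, the similarity measure, or the cluster update rule, so mean pooling, cosine similarity, a fixed threshold, and a running-mean centroid are all assumptions here. The names `doc_vector`, `OnlineClusterer`, and `toy_embeddings` are hypothetical.

```python
import numpy as np


def doc_vector(tokens, embeddings, dim):
    """Represent a document as the mean of its word vectors (assumed pooling)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na == 0.0 or nb == 0.0:
        return 0.0
    return float(np.dot(a, b) / (na * nb))


class OnlineClusterer:
    """Single-pass clustering: each cluster stands for one candidate event."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed fixed similarity cut-off
        self.centroids = []
        self.counts = []

    def add(self, vec):
        """Assign vec to the most similar cluster, or open a new one."""
        best, best_sim = -1, -1.0
        for i, c in enumerate(self.centroids):
            s = cosine(vec, c)
            if s > best_sim:
                best, best_sim = i, s
        if best >= 0 and best_sim >= self.threshold:
            # Update the centroid as a running mean of its members.
            n = self.counts[best]
            self.centroids[best] = (self.centroids[best] * n + vec) / (n + 1)
            self.counts[best] = n + 1
            return best
        self.centroids.append(np.array(vec, dtype=float))
        self.counts.append(1)
        return len(self.centroids) - 1


# Toy 2-dimensional embeddings, invented purely for illustration.
toy_embeddings = {
    "earthquake": np.array([1.0, 0.0]),
    "quake": np.array([0.9, 0.1]),
    "election": np.array([0.0, 1.0]),
    "vote": np.array([0.1, 0.9]),
}
docs = [["earthquake", "quake"], ["election", "vote"], ["quake"], ["vote", "election"]]

clusterer = OnlineClusterer(threshold=0.8)
labels = [clusterer.add(doc_vector(d, toy_embeddings, dim=2)) for d in docs]
print(labels)
```

In this toy run the two earthquake documents share one cluster and the two election documents share another. A real system would load pretrained word vectors and would likely tune or adapt the threshold rather than fix it.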
Translated title of the contribution | Word Embedding Based Chinese News Event Detection and Representation |
---|---|
Original language | Chinese (Traditional) |
Pages (from-to) | 275-282 |
Number of pages | 8 |
Journal | Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence |
Volume | 31 |
Issue number | 3 |
DOI | |
Publication status | Published - 1 Mar 2018 |
Externally published | Yes |
Keywords
- Dynamic Online Clustering
- Event Detection
- Word Embedding