Abstract
Existing event detection methods rely mainly on traditional TF-IDF document representations, which are high-dimensional and semantically sparse. This leads to low efficiency and accuracy and makes them unsuitable for large-scale online news event detection. This paper proposes a document representation method based on word embeddings. The method reduces the dimensionality of the document representation, alleviates semantic sparsity, and improves the efficiency and accuracy of document similarity calculation. Building on this representation, a dynamic online clustering method is proposed for online news event detection, which improves both the accuracy and the recall of event detection. Experiments on the standard TDT4 dataset and a real-world dataset show that the proposed adaptive online event detection method significantly improves both efficiency and accuracy compared with state-of-the-art methods.
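
The abstract does not give the details of the representation or of the clustering procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a document vector is the mean of its word vectors, that similarity is cosine similarity, and that clustering is a single-pass scheme with a fixed threshold. The names `embed_document`, `OnlineClusterer`, the `threshold` value and the toy two-dimensional word vectors are all hypothetical.

```python
import math


def embed_document(tokens, word_vectors):
    """Average the vectors of known tokens (mean pooling is an assumption)."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        return None  # no known words, so no representation
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class OnlineClusterer:
    """Single-pass clustering: each cluster stands for one candidate event."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # hypothetical similarity cut-off
        self.centroids = []
        self.counts = []

    def add(self, vec):
        """Assign vec to the most similar cluster, or open a new one."""
        best, best_sim = -1, -1.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= self.threshold:
            # Update the running mean of the matched cluster.
            n = self.counts[best]
            self.centroids[best] = [
                (c * n + x) / (n + 1) for c, x in zip(self.centroids[best], vec)
            ]
            self.counts[best] = n + 1
            return best
        self.centroids.append(list(vec))
        self.counts.append(1)
        return len(self.centroids) - 1


# Toy word vectors, invented purely for illustration.
word_vectors = {
    "quake": [1.0, 0.0],
    "rescue": [0.9, 0.1],
    "election": [0.0, 1.0],
    "vote": [0.1, 0.9],
}
docs = [["quake", "rescue"], ["election", "vote"], ["rescue", "quake", "unknown"]]

clusterer = OnlineClusterer(threshold=0.8)
labels = [clusterer.add(embed_document(d, word_vectors)) for d in docs]
print(labels)
```

Because each document is a short dense vector rather than a vocabulary-sized sparse TF-IDF vector, each similarity computation touches only as many entries as the embedding dimension, which is the efficiency argument the abstract makes.
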
Translated title of the contribution | Word Embedding Based Chinese News Event Detection and Representation |
---|---|
Original language | Chinese (Traditional) |
Pages (from-to) | 275-282 |
Number of pages | 8 |
Journal | Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence |
Volume | 31 |
Issue number | 3 |
Publication status | Published - 1 Mar 2018 |
Externally published | Yes |