Abstract
To improve the performance of text vectors, terms with similar syntactic or semantic features were clustered to construct concept clusters. Each text vector was then transformed from term space to concept-cluster space to represent the original text. The experiment compared text classification results based on TF-IDF, IG, TF-IDF-IG, and LSA, and on their combinations with concept clusters. The results show that text vectors based on concept clusters approximate the concepts of a text more accurately and improve the discrimination between different types of texts.
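The move from term space to concept-cluster space can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how clusters are built or how term weights are aggregated, so the cluster assignments, the example weights, the function name `to_concept_space`, and the choice to sum weights within a cluster are all assumptions made for illustration.

```python
from collections import defaultdict


def to_concept_space(term_weights, term_to_cluster):
    """Project a term-weight vector onto concept clusters.

    Assumption: weights of terms in the same cluster are summed, and
    terms that belong to no cluster are dropped. The abstract does not
    specify the aggregation rule.
    """
    concept_weights = defaultdict(float)
    for term, weight in term_weights.items():
        cluster = term_to_cluster.get(term)
        if cluster is not None:
            concept_weights[cluster] += weight
    return dict(concept_weights)


# Hypothetical cluster assignments (term -> concept cluster label).
term_to_cluster = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
    "bank": "finance",
    "loan": "finance",
}

# Hypothetical term weights for one document (for example TF-IDF scores).
doc = {"car": 0.5, "automobile": 0.25, "loan": 0.5, "weather": 0.125}

concept_vector = to_concept_space(doc, term_to_cluster)
print(concept_vector)
```

In this sketch, "car" and "automobile" collapse into a single "vehicle" dimension, so two documents that use different words for the same concept end up with overlapping vectors, which is the effect the abstract attributes to concept clusters.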
Original language | English |
---|---|
Pages (from-to) | 44-47 |
Number of pages | 4 |
Journal | Tongxin Xuebao/Journal on Communications |
Volume | 31 |
Issue number | 8 A |
Publication status | Published - Aug 2010 |
Keywords
- Chinese information processing
- Concept cluster
- Text classification
- Text vector