Abstract
Because traditional feature selection methods for text clustering cannot find the best feature set, a genetic algorithm is applied to feature selection, as it can find the global optimal solution and searches efficiently. In this algorithm, a feature combination is regarded as a chromosome and encoded as a binary string, and the text set density is used as the fitness function to evaluate the fitness of each individual, that is, of each candidate feature combination. Through the operations of selection, crossover and mutation, the optimal feature set can be obtained rapidly. Experimental results on an open corpus show that feature selection based on the genetic algorithm improves text clustering precision by 5.9% and decreases clustering time by 15 s.
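The procedure described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not define the text set density or state whether it is maximized or minimized, so the `density` function below (mean pairwise cosine similarity over the selected features) is only a hypothetical stand-in, and the sketch simply maximizes whatever `fitness` callable it is given. The names `density` and `select_features`, the tournament selection, single-point crossover, elitism, and all parameter values are likewise assumptions made for illustration.

```python
import math
import random


def density(docs, mask):
    """Hypothetical stand-in for 'text set density' (an assumption, not the
    paper's definition): mean pairwise cosine similarity of the documents,
    computed only over the features whose bit in `mask` is 1."""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0
    vecs = [[d[i] for i in selected] for d in docs]
    norms = [math.sqrt(sum(x * x for x in v)) for v in vecs]
    total, pairs = 0.0, 0
    for a in range(len(vecs)):
        for b in range(a + 1, len(vecs)):
            pairs += 1
            if norms[a] > 0 and norms[b] > 0:
                dot = sum(x * y for x, y in zip(vecs[a], vecs[b]))
                total += dot / (norms[a] * norms[b])
    return total / pairs if pairs else 0.0


def select_features(docs, fitness, pop_size=20, generations=30,
                    crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a binary feature mask that maximizes `fitness(mask)`."""
    rng = random.Random(seed)
    n = len(docs[0])
    # Each chromosome is a binary string: bit i == 1 keeps feature i.
    population = [[rng.randint(0, 1) for _ in range(n)]
                  for _ in range(pop_size)]

    def tournament():
        # Selection: the fitter of two randomly drawn individuals wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    best = max(population, key=fitness)
    for _ in range(generations):
        next_pop = [best[:]]  # elitism: carry the best individual over
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            if n > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        population = next_pop
        best = max(population, key=fitness)
    return best, fitness(best)


# Toy term-frequency vectors (made up for illustration): 3 documents, 4 features.
docs = [[1, 0, 2, 0], [2, 0, 1, 1], [0, 3, 0, 1]]
best_mask, best_score = select_features(docs, lambda m: density(docs, m))
```

Passing the fitness as a callable keeps the search loop independent of how the density is actually defined, so a different measure (or its negation, if the measure should be minimized) can be swapped in without touching the genetic operators.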
| Original language | English |
|---|---|
| Pages (from-to) | 133-136 |
| Number of pages | 4 |
| Journal | Huanan Ligong Daxue Xuebao/Journal of South China University of Technology (Natural Science) |
| Volume | 32 |
| Issue number | SUPPL. |
| Publication status | Published - Nov 2004 |
Keywords
- Chinese information processing
- Feature selection
- Genetic algorithm
- Text clustering