Feature selection for text clustering based on the genetic algorithm

  • Feng Zhang*
  • Xiao Zhong Fan
  • Yun Xu

  * Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Because traditional feature selection methods for text clustering cannot find the best feature set, the genetic algorithm is applied to feature selection, as it can search for the globally optimal solution with high efficiency. In this algorithm, a feature combination is regarded as a chromosome and encoded as a binary string, and the text set density serves as the fitness function used to evaluate each individual. Through the operations of selection, crossover and mutation, the optimal feature set can be obtained rapidly. Experimental results on an open corpus show that feature selection based on the genetic algorithm improves text clustering precision by 5.9% and decreases the clustering time by 15 s.
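
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the text set density or say how it is optimized, so the sketch assumes density is the mean cosine similarity between each document and the collection centroid, and that a lower density is fitter. Tournament selection, one-point crossover, bit-flip mutation, elitism, and all names and parameter values (`density`, `fitness`, `evolve`, `pop_size`, and so on) are likewise hypothetical choices made for illustration.

```python
import math
import random


def density(docs, mask):
    """Mean cosine similarity between each document and the centroid,
    computed only over the features whose bit in `mask` is 1.
    (Assumed definition; the abstract does not give one.)"""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 1.0  # assumption: an empty feature set is maximally dense
    vecs = [[d[i] for i in idx] for d in docs]
    centroid = [sum(col) / len(vecs) for col in zip(*vecs)]
    c_norm = math.sqrt(sum(x * x for x in centroid))
    total = 0.0
    for v in vecs:
        v_norm = math.sqrt(sum(x * x for x in v))
        if v_norm == 0 or c_norm == 0:
            total += 1.0  # assumption: a document with no selected terms counts as dense
        else:
            total += sum(a * b for a, b in zip(v, centroid)) / (v_norm * c_norm)
    return total / len(vecs)


def fitness(docs, mask):
    # Assumption: lower density (documents more spread out) is fitter.
    return -density(docs, mask)


def evolve(docs, n_features, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Evolve a binary chromosome; bit i == 1 means feature i is kept."""
    rng = random.Random(seed)

    def score(mask):
        return fitness(docs, mask)

    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: carry the best individual forward
        while len(new_pop) < pop_size:
            # Selection: binary tournament, run once per parent.
            p1 = max(rng.sample(pop, 2), key=score)
            p2 = max(rng.sample(pop, 2), key=score)
            child = p1[:]
            # Crossover: one cut point, head from p1 and tail from p2.
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)
                child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit independently with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=score)
    return best


# Toy term-weight matrix: 4 documents, 4 candidate features (made-up values).
docs = [[1, 0, 2, 0], [0, 1, 2, 0], [1, 1, 2, 1], [0, 0, 2, 3]]
best = evolve(docs, n_features=4, seed=1)
print(best, fitness(docs, best))
```

The selected feature subset is read off the returned chromosome as the positions holding a 1. On a real corpus the document vectors would be term weights such as TF-IDF, and the population size, rates, and number of generations would need tuning.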

Original language: English
Pages (from-to): 133-136
Number of pages: 4
Journal: Huanan Ligong Daxue Xuebao/Journal of South China University of Technology (Natural Science)
Volume: 32
Issue number: SUPPL.
Publication status: Published - Nov 2004

Keywords

  • Chinese information processing
  • Feature selection
  • Genetic algorithm
  • Text clustering
