Abstract
The rapid development of microarray technology poses a great challenge to conventional clustering methods. The sparsity of the data (samples), the high dimensionality of the feature (gene) space, and the many irrelevant or redundant features all make it difficult to find correct clusters in gene expression data by applying conventional clustering methods directly. In this paper, we present CIS, an algorithm for clustering biological samples using gene expression microarray data. Unlike other approaches, CIS iterates between two processes: reclustering genes (rather than filtering them) and clustering samples. Following a policy of progressive refinement, CIS repeatedly partitions the initial set of genes using the newly generated sample clusters as features, and then partitions the samples again using the newly generated gene clusters as features, in order to identify the significant sample clusters and the relevant genes. The method is applied to two gene microarray data sets, one on colon cancer and one on leukemia. The experimental results show that CIS works well on both data sets. We partition the two sample sets using eight and twenty-nine genes respectively, achieving an accuracy of about 90% on both. These results indicate that CIS may be a promising approach for gene expression data analysis when domain knowledge is absent.
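
The alternation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering procedure, so a plain k-means with fixed cluster counts and farthest-point initialisation stands in for the nonparametric step, and the final selection of a small gene subset is omitted. The function names, the fixed number of rounds, and the use of per-cluster means as features are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, rng, n_iter=20):
    """Stand-in clusterer (assumption): k-means with farthest-point init."""
    centers = [list(rng.choice(rows))]
    while len(centers) < k:
        # Next center is the row farthest from every center chosen so far.
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(rows)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(r, centers[c]))
                  for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def collapse_columns(matrix, col_labels, k):
    """Replace the columns by one mean-valued feature per column cluster."""
    out = []
    for row in matrix:
        sums, counts = [0.0] * k, [0] * k
        for value, lab in zip(row, col_labels):
            sums[lab] += value
            counts[lab] += 1
        out.append([s / n if n else 0.0 for s, n in zip(sums, counts)])
    return out


def alternate_clustering(expr, n_gene_clusters, n_sample_clusters,
                         n_rounds=5, seed=0):
    """Alternate gene and sample clustering on a genes x samples matrix.

    Hypothetical helper: the cluster counts and n_rounds are illustrative
    parameters, not values taken from the paper.
    """
    rng = random.Random(seed)
    samples = [list(col) for col in zip(*expr)]  # samples x genes
    # Initial sample clusters from all genes (an assumed starting point).
    sample_labels = kmeans(samples, n_sample_clusters, rng)
    gene_labels = [0] * len(expr)
    for _ in range(n_rounds):
        # Cluster genes, using the current sample clusters as features.
        gene_features = collapse_columns(expr, sample_labels, n_sample_clusters)
        gene_labels = kmeans(gene_features, n_gene_clusters, rng)
        # Cluster samples, using the new gene clusters as features.
        sample_features = collapse_columns(samples, gene_labels, n_gene_clusters)
        sample_labels = kmeans(sample_features, n_sample_clusters, rng)
    return gene_labels, sample_labels
```

Calling `alternate_clustering(expr, 2, 2)` on a small genes-by-samples list of lists returns one label per gene and one label per sample. A closer rendering of CIS would replace the fixed cluster counts and round limit with whatever nonparametric criterion and stopping rule the paper defines.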
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 11th Joint International Computer Conference, JICC 2005 |
| Publisher | World Scientific Publishing Co. Pte Ltd |
| Pages | 651-656 |
| Number of pages | 6 |
| ISBN (Print) | 9812565329, 9789812565327 |
| DOIs | |
| Publication status | Published - 2005 |
| Externally published | Yes |
| Event | 11th Joint International Computer Conference, JICC 2005 - Chongqing, China. Duration: 10 Nov 2005 → 12 Nov 2005 |
Publication series
| Name | Proceedings of the 11th Joint International Computer Conference, JICC 2005 |
|---|---|
Conference
| Conference | 11th Joint International Computer Conference, JICC 2005 |
|---|---|
| Country/Territory | China |
| City | Chongqing |
| Period | 10/11/05 → 12/11/05 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- clustering
- gene expression data
- microarray
- nonparametric clustering
Fingerprint
Dive into the research topics of 'CIS: A nonparametric clustering algorithm for gene expression data'. Together they form a unique fingerprint.