CIS: A nonparametric clustering algorithm for gene expression data

Yuhai Zhao, Ying Yin, Guoren Wang, Keming Mao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The rapid development of microarray technology poses a great challenge to conventional clustering methods. The sparsity of the data (samples), the high dimensionality of the feature (gene) space, and the many irrelevant or redundant features all make it difficult to find correct clusters in gene expression data by applying conventional clustering methods directly. In this paper, we present CIS, an algorithm for clustering biological samples using gene expression microarray data. Unlike other approaches, CIS iterates between two processes: reclustering genes (not filtering genes) and clustering samples. Inspired by the policy of progressive refinement, CIS repeatedly partitions the initial set of genes using the newly generated sample clusters as features, and then partitions the samples again using the newly generated gene clusters as features, in order to identify the significant sample clusters and the relevant genes. The method is applied to two gene microarray data sets, on colon cancer and leukemia. The experimental results show that CIS works well on both data sets. Partitioning the two sample sets by eight and twenty-nine genes, respectively, we obtain an accuracy of about 90% in both cases. These results indicate that CIS might be a promising approach for gene expression data analysis when domain knowledge is absent.
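The alternation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the nonparametric clustering step, so a plain k-means with fixed cluster counts stands in for it here, and every name below (`kmeans`, `cluster_means`, `alternate_clustering`, the toy matrix) is hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-first initialization.

    Stand-in only: the abstract does not say which clustering step CIS uses.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_means(rows, col_labels, k):
    """Describe each row by its mean value over each cluster of columns."""
    feats = []
    for row in rows:
        means = []
        for c in range(k):
            vals = [v for v, lab in zip(row, col_labels) if lab == c]
            means.append(sum(vals) / len(vals) if vals else 0.0)
        feats.append(means)
    return feats


def alternate_clustering(expr, n_sample_clusters=2, n_gene_clusters=3, rounds=5):
    """expr[g][s] is the expression level of gene g in sample s."""
    samples = [list(col) for col in zip(*expr)]  # one row per sample
    sample_labels = kmeans(samples, n_sample_clusters)
    gene_labels = [0] * len(expr)
    for _ in range(rounds):
        # Recluster genes, using the current sample clusters as features.
        gene_feats = cluster_means(expr, sample_labels, n_sample_clusters)
        gene_labels = kmeans(gene_feats, n_gene_clusters)
        # Recluster samples, using the new gene clusters as features.
        sample_feats = cluster_means(samples, gene_labels, n_gene_clusters)
        new_labels = kmeans(sample_feats, n_sample_clusters)
        if new_labels == sample_labels:  # partition stopped changing
            break
        sample_labels = new_labels
    return sample_labels, gene_labels


# Toy matrix (invented for illustration): 6 genes x 6 samples.
toy = [
    [5, 5, 5, 0, 0, 0],
    [5, 5, 5, 0, 0, 0],
    [0, 0, 0, 5, 5, 5],
    [0, 0, 0, 5, 5, 5],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
]
sample_labels, gene_labels = alternate_clustering(toy)
```

On the toy matrix the first three samples fall in one cluster and the last three in the other, while the two constant genes form their own gene cluster. The fixed cluster counts are a simplification made for this sketch; a nonparametric method would not take them as input.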

Original language: English
Title of host publication: Proceedings of the 11th Joint International Computer Conference, JICC 2005
Publisher: World Scientific Publishing Co. Pte Ltd
Pages: 651-656
Number of pages: 6
ISBN (Print): 9812565329, 9789812565327
Publication status: Published - 2005
Externally published: Yes
Event: 11th Joint International Computer Conference, JICC 2005 - Chongqing, China
Duration: 10 Nov 2005 → 12 Nov 2005

Publication series

Name: Proceedings of the 11th Joint International Computer Conference, JICC 2005

Conference

Conference: 11th Joint International Computer Conference, JICC 2005
Country/Territory: China
City: Chongqing
Period: 10/11/05 → 12/11/05

Keywords

  • clustering
  • gene expression data
  • microarray
  • nonparametric clustering
