Clustering over an evolving data stream based on grid density and correlation

Jiadong Ren*, Binlei Cai, Changzhen Hu

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

7 Citations (Scopus)

Abstract

Most existing grid-based clustering algorithms cannot adapt to the evolution of data streams and cannot handle noise points effectively. Further, data points at cluster edges cannot be clustered accurately. In this paper, we present GDC-Stream, a new approach for clustering evolving data streams based on grid density and correlation. A new time-based density threshold function is introduced to remove noise points in real time. Moreover, a novel correlation-based technique is adopted to improve clustering accuracy. In the initial stage of the algorithm, the data stream is clustered by grid density. As new data records arrive, the pruning strategy is applied to periodically inspect and remove noise points. Meanwhile, based on grid density and correlation, the generated clusters are dynamically adjusted to capture changes in the data stream. The experimental results show that GDC-Stream has better clustering quality and scalability than CluStream. ICIC International

Original language: English
Pages (from-to): 1603-1609
Number of pages: 7
Journal: ICIC Express Letters
Volume: 4
Issue number: 5
Publication status: Published - Oct 2010
