HCluWin: An algorithm for clustering heterogeneous data streams over sliding windows

Jiadong Ren*, Changzhen Hu, Ruiqing Ma

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

9 Citations (Scopus)

Abstract

Many applications in web usage mining, such as business intelligence and usage characterization, require effective and efficient techniques to discover users with similar usage patterns and web pages with correlated contents in the physical world. Clustering click streams can help to achieve this goal. Despite their high processing rate, the existing methods for clustering click streams over sliding windows do not handle the categorical attributes present in click stream data. In this paper, we present HCluWin, an approach for clustering heterogeneous data streams, which contain both continuous and categorical attributes, over sliding windows. A Heterogeneous Temporal Cluster Feature (HTCF) is introduced to monitor the distribution statistics of heterogeneous data points. Based on this structure, the Exponential Histogram of Heterogeneous Cluster Feature (EHHCF) is presented. In addition, a new similarity measure between two heterogeneous objects is proposed. Experimental results show that the clustering quality of HCluWin is higher than that of CluWin and that the stream processing rate of HCluWin is higher than that of HCluStream.
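
The abstract does not give the definitions of HTCF or of the similarity measure, so the following is only an illustrative sketch of the general idea of summarizing mixed-type points: additive sums for continuous attributes, frequency counts for categorical attributes, and a latest timestamp. The class `MixedClusterFeature`, the function `mixed_distance`, and the choice of Euclidean distance plus a frequency-based mismatch term are assumptions made for illustration, not the paper's formulation.

```python
from collections import Counter
from dataclasses import dataclass, field
import math


@dataclass
class MixedClusterFeature:
    """Hypothetical summary of a cluster of mixed-type points.

    This is NOT the paper's HTCF definition; it only illustrates the
    kind of statistics such a structure could maintain incrementally.
    """
    n: int = 0
    linear_sum: list = field(default_factory=list)   # per continuous attribute
    square_sum: list = field(default_factory=list)   # per continuous attribute
    cat_counts: list = field(default_factory=list)   # one Counter per categorical attribute
    last_timestamp: float = 0.0

    def add(self, numeric, categorical, timestamp):
        """Absorb one point; every statistic is updated additively."""
        if self.n == 0:
            self.linear_sum = [0.0] * len(numeric)
            self.square_sum = [0.0] * len(numeric)
            self.cat_counts = [Counter() for _ in categorical]
        for i, x in enumerate(numeric):
            self.linear_sum[i] += x
            self.square_sum[i] += x * x
        for counter, value in zip(self.cat_counts, categorical):
            counter[value] += 1
        self.n += 1
        self.last_timestamp = max(self.last_timestamp, timestamp)

    def centroid(self):
        """Mean of the continuous attributes."""
        return [s / self.n for s in self.linear_sum]


def mixed_distance(cf, numeric, categorical):
    """Assumed dissimilarity between a point and a cluster summary.

    Continuous part: Euclidean distance to the centroid.
    Categorical part: for each attribute, 1 minus the relative
    frequency of the point's value inside the cluster.
    """
    num_part = math.sqrt(
        sum((x - c) ** 2 for x, c in zip(numeric, cf.centroid()))
    )
    cat_part = sum(
        1.0 - counter[value] / cf.n
        for counter, value in zip(cf.cat_counts, categorical)
    )
    return num_part + cat_part


cf = MixedClusterFeature()
cf.add([1.0, 2.0], ["home"], timestamp=1.0)
cf.add([3.0, 4.0], ["cart"], timestamp=2.0)
print(cf.centroid())
print(mixed_distance(cf, [2.0, 3.0], ["home"]))
```

Because every field is a sum or a count, two such summaries could be merged by adding their fields, which is the property an exponential-histogram scheme over sliding windows would rely on; how EHHCF actually organizes and expires these summaries is not stated in the abstract.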

Original language: English
Pages (from-to): 2171-2179
Number of pages: 9
Journal: International Journal of Innovative Computing, Information and Control
Volume: 6
Issue number: 5
Publication status: Published - May 2010

Keywords

  • Clustering
  • Data stream
  • Heterogeneous attribute
  • Sliding windows
