TY - GEN
T1 - An algorithm for clustering uncertain data streams over sliding windows
AU - Huang, Guoyan
AU - Liang, Dapeng
AU - Ren, Jiadong
AU - Hu, Changzhen
PY - 2010
Y1 - 2010
N2 - Existing algorithms for clustering uncertain data streams cannot analyze recent data in detail. In this paper, we propose SWCUStreams (Clustering Uncertain Data Streams over Sliding Windows), which captures the distribution characteristics of recent data by maintaining the Exponential Histogram of Uncertainty Cluster Feature (EHUCF). SWCUStreams adopts the clustering framework of CluStream. In the online micro-cluster phase, the Uncertainty Temporal Cluster Feature (UTCF) is defined to describe uncertain tuples. Based on the UTCF, the EHUCF is proposed to store the distribution characteristics of recent data and, through its association with the UTCF, to dynamically delete expired records. In the offline macro-cluster phase, the final clustering results are generated from the statistical information of the EHUCF using the UK-means algorithm. Experimental results over different types of data sets show that SWCUStreams achieves higher cluster quality.
AB - Existing algorithms for clustering uncertain data streams cannot analyze recent data in detail. In this paper, we propose SWCUStreams (Clustering Uncertain Data Streams over Sliding Windows), which captures the distribution characteristics of recent data by maintaining the Exponential Histogram of Uncertainty Cluster Feature (EHUCF). SWCUStreams adopts the clustering framework of CluStream. In the online micro-cluster phase, the Uncertainty Temporal Cluster Feature (UTCF) is defined to describe uncertain tuples. Based on the UTCF, the EHUCF is proposed to store the distribution characteristics of recent data and, through its association with the UTCF, to dynamically delete expired records. In the offline macro-cluster phase, the final clustering results are generated from the statistical information of the EHUCF using the UK-means algorithm. Experimental results over different types of data sets show that SWCUStreams achieves higher cluster quality.
KW - Clustering
KW - Sliding windows
KW - Uncertain data streams
UR - http://www.scopus.com/inward/record.url?scp=77958047220&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:77958047220
SN - 9788988678275
T3 - Proceeding - 6th International Conference on Digital Content, Multimedia Technology and Its Applications, IDC2010
SP - 173
EP - 177
BT - Proceeding - 6th International Conference on Digital Content, Multimedia Technology and Its Applications, IDC2010
T2 - 6th International Conference on Digital Content, Multimedia Technology and Its Applications, IDC2010
Y2 - 16 August 2010 through 18 August 2010
ER -