TY - GEN
T1 - A novel method for identifying optimal number of clusters with marginal differential entropy
AU - Shu, Bo
AU - Chen, Wei
AU - Niu, Zhendong
AU - Zhang, Changmin
AU - Jiang, Xiaotian
PY - 2013
Y1 - 2013
N2 - Clustering evaluation plays an important role in clustering algorithms. Most recent approaches to evaluating clusterings and identifying the optimal number of clusters either compute pairwise distances between data points or evaluate the entropy over the full-dimensional space, and therefore have high computational complexity. In this paper, we propose an entropy-based clustering evaluation method for identifying the optimal number of clusters, which first projects the cluster centroids onto each individual dimension and then accumulates the marginal differential entropy in each dimension. Using the sum of marginal entropies, we can analyze clustering performance and identify the optimal number of clusters. This method can dramatically reduce the computational complexity without losing accuracy. Experimental results show that the proposed method is highly stable under various conditions and can be applied to massive sets of high-dimensional data points.
AB - Clustering evaluation plays an important role in clustering algorithms. Most recent approaches to evaluating clusterings and identifying the optimal number of clusters either compute pairwise distances between data points or evaluate the entropy over the full-dimensional space, and therefore have high computational complexity. In this paper, we propose an entropy-based clustering evaluation method for identifying the optimal number of clusters, which first projects the cluster centroids onto each individual dimension and then accumulates the marginal differential entropy in each dimension. Using the sum of marginal entropies, we can analyze clustering performance and identify the optimal number of clusters. This method can dramatically reduce the computational complexity without losing accuracy. Experimental results show that the proposed method is highly stable under various conditions and can be applied to massive sets of high-dimensional data points.
KW - Clustering Evaluation
KW - Differential Entropy
KW - Information Theory
UR - https://www.scopus.com/pages/publications/84893108243
U2 - 10.1007/978-3-642-39527-7_36
DO - 10.1007/978-3-642-39527-7_36
M3 - Conference contribution
AN - SCOPUS:84893108243
SN - 9783642395260
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 371
EP - 382
BT - Web-Age Information Management - WAIM 2013, International Workshops
T2 - 14th International Conference on Web-Age Information Management, WAIM 2013
Y2 - 14 June 2013 through 16 June 2013
ER -