TY - GEN
T1 - A hierarchical clustering algorithm based on K-means with constraints
AU - Hang, Guoyan
AU - Zhang, Dongmei
AU - Ren, Jiadong
AU - Hu, Changzhen
PY - 2009
Y1 - 2009
N2 - Hierarchical clustering is one of the most important tasks in data mining. However, existing hierarchical clustering algorithms are time-consuming and yield low clustering quality because they ignore constraints. This paper proposes a Hierarchical Clustering Algorithm based on K-means with Constraints (HCAKC). To improve clustering efficiency, HCAKC defines an Improved Silhouette to determine the optimal number of clusters. To improve hierarchical clustering quality, existing pairwise must-link and cannot-link constraints are used to update the cohesion matrix between clusters, and a penalty factor is introduced to modify the similarity metric to address constraint violations. Experimental results show that HCAKC has lower computational complexity and better clustering quality than the existing algorithm CSM.
AB - Hierarchical clustering is one of the most important tasks in data mining. However, existing hierarchical clustering algorithms are time-consuming and yield low clustering quality because they ignore constraints. This paper proposes a Hierarchical Clustering Algorithm based on K-means with Constraints (HCAKC). To improve clustering efficiency, HCAKC defines an Improved Silhouette to determine the optimal number of clusters. To improve hierarchical clustering quality, existing pairwise must-link and cannot-link constraints are used to update the cohesion matrix between clusters, and a penalty factor is introduced to modify the similarity metric to address constraint violations. Experimental results show that HCAKC has lower computational complexity and better clustering quality than the existing algorithm CSM.
KW - Constraints
KW - Hierarchical clustering
KW - Improved silhouette
KW - K-means
UR - http://www.scopus.com/inward/record.url?scp=77951437801&partnerID=8YFLogxK
U2 - 10.1109/ICICIC.2009.18
DO - 10.1109/ICICIC.2009.18
M3 - Conference contribution
AN - SCOPUS:77951437801
SN - 9780769538730
T3 - 2009 4th International Conference on Innovative Computing, Information and Control, ICICIC 2009
SP - 1479
EP - 1482
BT - 2009 4th International Conference on Innovative Computing, Information and Control, ICICIC 2009
T2 - 2009 4th International Conference on Innovative Computing, Information and Control, ICICIC 2009
Y2 - 7 December 2009 through 9 December 2009
ER -