A hierarchical clustering algorithm based on K-means with constraints

Guoyan Hang*, Dongmei Zhang, Jiadong Ren, Changzhen Hu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

8 Citations (Scopus)

Abstract

Hierarchical clustering is one of the most important tasks in data mining. However, existing hierarchical clustering algorithms are time-consuming and achieve low clustering quality because they ignore constraints. This paper proposes a Hierarchical Clustering Algorithm based on K-means with Constraints (HCAKC). To improve clustering efficiency, HCAKC defines an Improved Silhouette to determine the optimal number of clusters. To improve hierarchical clustering quality, existing pairwise must-link and cannot-link constraints are used to update the cohesion matrix between clusters, and a penalty factor is introduced to modify the similarity metric so that constraint violations are addressed. The experimental results show that HCAKC has lower computational complexity and better clustering quality than the existing algorithm CSM.
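
The abstract does not give the definition of the Improved Silhouette or of the penalty factor, so the sketch below is not the HCAKC method. It only illustrates the two standard ingredients the abstract builds on: the classical silhouette coefficient, which can be used to compare candidate clusterings, and a count of violated pairwise must-link and cannot-link constraints. The function names `silhouette` and `constraint_violations` are hypothetical and chosen for this illustration.

```python
from math import dist


def silhouette(points, labels):
    """Mean classical silhouette coefficient over all points.

    Assumption: this is the standard definition, not the paper's
    Improved Silhouette, which the abstract does not specify.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # usual convention: a singleton cluster scores 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def constraint_violations(labels, must_link, cannot_link):
    """Count pairwise constraints that a labeling breaks.

    must_link / cannot_link are lists of (i, j) index pairs.
    """
    broken_ml = sum(1 for i, j in must_link if labels[i] != labels[j])
    broken_cl = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return broken_ml + broken_cl


# Toy data (hypothetical): two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = [0, 0, 1, 1]   # groups nearby points together
bad = [0, 1, 0, 1]    # splits each nearby pair
must_link = [(0, 1)]
cannot_link = [(0, 2)]

for name, labels in (("good", good), ("bad", bad)):
    print(name, silhouette(points, labels),
          constraint_violations(labels, must_link, cannot_link))
```

On this toy data the labeling that groups nearby points scores a higher silhouette and breaks no constraints, while the other labeling scores lower and breaks both. How HCAKC combines such quantities in its cohesion matrix and similarity metric is described in the paper itself, not here.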

Original language: English
Title of host publication: 2009 4th International Conference on Innovative Computing, Information and Control, ICICIC 2009
Pages: 1479-1482
Number of pages: 4
Publication status: Published - 2009
Event: 2009 4th International Conference on Innovative Computing, Information and Control, ICICIC 2009 - Kaohsiung, Taiwan, Province of China
Duration: 7 Dec 2009 to 9 Dec 2009

Publication series

Name: 2009 4th International Conference on Innovative Computing, Information and Control, ICICIC 2009

Conference

Conference: 2009 4th International Conference on Innovative Computing, Information and Control, ICICIC 2009
Country/Territory: Taiwan, Province of China
City: Kaohsiung
Period: 7/12/09 to 9/12/09

Keywords

  • Constraints
  • Hierarchical clustering
  • Improved silhouette
  • K-means
