HASTA: A hierarchical-grid clustering algorithm with data field

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper, a novel clustering algorithm, HASTA (HierArchical-grid cluStering based on daTA field), is proposed. It models the dataset as a data field by assigning all the data objects to quantized grids. HASTA defines cluster centers as the locations where the local potential reaches its maximum value. Cluster edges are identified by analyzing the first-order partial derivative of the potential value, so the full extent of arbitrarily shaped clusters can be detected. Experiments demonstrate that HASTA performs effectively on different datasets and can find clusters of arbitrary shape in noisy conditions. In addition, HASTA does not require users to preset the number of clusters in the dataset, and it is insensitive to the order of data input. The time complexity of HASTA is O(n). These advantages could benefit the mining of big data.
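The ideas named in the abstract (quantized grids, a potential value per grid cell, centers at local potential maxima, first-order derivatives for edges) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the potential function, so a Gaussian-like kernel is assumed here, and the function names `grid_potential`, `local_maxima`, and `gradient_magnitude`, as well as the parameters `bins` and `sigma`, are hypothetical. The naive all-pairs potential below is also not O(n); only the binning step is linear in the number of points.

```python
import numpy as np

def grid_potential(points, bins=10, sigma=1.5):
    """Quantize 2-D points into a grid and compute a potential per cell.

    Assumption: each occupied cell acts as a source whose mass is its point
    count, with a Gaussian-like falloff exp(-(d / sigma)^2), d in cell units.
    """
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=bins)
    ii, jj = np.indices(counts.shape)
    cells = np.stack([ii.ravel(), jj.ravel()], axis=1).astype(float)
    mass = counts.ravel()
    occupied = mass > 0
    sources = cells[occupied]
    # Squared distance from every cell to every occupied cell.
    diff = cells[:, None, :] - sources[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    potential = (np.exp(-dist2 / sigma ** 2) * mass[occupied]).sum(axis=1)
    return potential.reshape(counts.shape), counts

def local_maxima(potential):
    """Return indices of cells strictly greater than all 8 neighbours."""
    n, m = potential.shape
    padded = np.pad(potential, 1, constant_values=-np.inf)
    is_max = np.ones((n, m), dtype=bool)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            neighbour = padded[1 + di:1 + di + n, 1 + dj:1 + dj + m]
            is_max &= potential > neighbour
    return np.argwhere(is_max)

def gradient_magnitude(potential):
    """First-order finite-difference gradient magnitude of the potential."""
    gx, gy = np.gradient(potential)
    return np.hypot(gx, gy)

# Two tight groups of points at opposite corners of the grid.
pts = np.array([[0.0, 0.0]] * 5 + [[10.0, 10.0]] * 5)
pot, counts = grid_potential(pts, bins=10, sigma=1.5)
peaks = local_maxima(pot)          # candidate cluster centers
steepness = gradient_magnitude(pot)  # input to an edge criterion
```

How the derivative is turned into an edge decision is not stated in the abstract, so the sketch stops at computing the gradient magnitude and leaves that criterion open.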

Original language: English
Pages (from-to): 39-54
Number of pages: 16
Journal: International Journal of Data Warehousing and Mining
Volume: 10
Issue number: 2
Publication status: Published - 1 Apr 2014
Externally published: Yes

Keywords

  • Clustering algorithms
  • Data field
  • Data mining
  • HASTA
  • Potential value
