Abstract
This paper proposes HASTA (HierArchical-grid cluStering based on daTA field), a novel clustering algorithm that models a dataset as a data field by assigning all data objects to quantized grids. HASTA defines cluster centers as the locations where the local potential reaches its maximum. It identifies cluster edges by analyzing the first-order partial derivative of the potential value, so the full extent of arbitrarily shaped clusters can be detected. The experimental case demonstrates that HASTA performs effectively on different datasets and can find clusters of arbitrary shape in noisy conditions. In addition, HASTA does not require users to preset the exact number of clusters in the dataset, and it is insensitive to the order of data input. The time complexity of HASTA is O(n). These advantages may benefit the mining of big data.
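The grid-and-potential idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian-like potential, the parameters `cell` and `sigma`, and the function names `quantize`, `potentials`, and `local_maxima` are assumptions made for illustration only. The pairwise loop over occupied cells is also not the O(n) procedure the paper reports, and the sketch covers neither the hierarchical grid nor the derivative-based edge detection.

```python
import math
from collections import Counter

def quantize(points, cell):
    """Count the points in each grid cell (cell index = floor(coordinate / cell))."""
    return Counter(tuple(math.floor(c / cell) for c in p) for p in points)

def potentials(grid, sigma=1.0):
    """Assumed potential of a cell: sum of mass * exp(-(distance / sigma)^2) over occupied cells."""
    return {
        a: sum(m * math.exp(-(math.dist(a, b) / sigma) ** 2) for b, m in grid.items())
        for a in grid
    }

def local_maxima(pot):
    """Cells whose potential is not exceeded by any adjacent occupied cell."""
    centers = []
    for a, v in pot.items():
        # Adjacent means Chebyshev distance 1 between cell indices.
        neighbours = [w for b, w in pot.items()
                      if b != a and max(abs(x - y) for x, y in zip(a, b)) <= 1]
        if all(v >= w for w in neighbours):
            centers.append(a)
    return centers

# Hypothetical toy data: two small groups of 2-D points.
points = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (1.1, 0.2), (0.3, 1.2),
          (5.1, 5.2), (5.3, 5.4), (5.2, 5.1), (6.1, 5.3)]
grid = quantize(points, cell=1.0)
centers = local_maxima(potentials(grid, sigma=1.0))
print(centers)
```

In this sketch an isolated occupied cell also counts as a local maximum, so a practical version would need a noise threshold on the potential; the abstract does not specify how HASTA handles that case.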
| Original language | English |
|---|---|
| Pages (from-to) | 39-54 |
| Number of pages | 16 |
| Journal | International Journal of Data Warehousing and Mining |
| Volume | 10 |
| Issue number | 2 |
| Publication status | Published - 1 Apr 2014 |
| Externally published | Yes |
Keywords
- Clustering algorithms
- Data field
- Data mining
- HASTA
- Potential value
Fingerprint
Dive into the research topics of 'HASTA: A hierarchical-grid clustering algorithm with data field'. Together they form a unique fingerprint.