Abstract
Clustering is fundamental to using big data. However, affinity propagation (AP) performs poorly on non-convex datasets, and the input parameters have a marked impact on density-based spatial clustering of applications with noise (DBSCAN). Moreover, new characteristics such as volume, variety, velocity, and veracity make big data difficult to group. To address these issues, a parameter-free AP (PFAP) is proposed that groups big data on the basis of both distance and density. First, it obtains a set of normalized densities from the AP clustering. The estimated parameters are monotonic. Then, the densities are used to run density clustering multiple times. Finally, the multiple density clustering results undergo a two-stage amalgamation to produce the final clustering result. Experimental results on several benchmark datasets show that PFAP achieves better clustering quality than DBSCAN, AP, and APSCAN. It also has better performance than APSCAN and FSDP.
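The step of running density clustering several times can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the density thresholds are already available as an increasing list of radii (`eps_values`, a hypothetical stand-in for the values PFAP derives from AP), uses a plain textbook DBSCAN, and leaves out both the AP stage and the two-stage amalgamation. The names `dbscan`, `multi_density_runs`, and `min_pts` are illustrative only.

```python
import math

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nb)
    return labels

def multi_density_runs(points, eps_values, min_pts):
    """Run DBSCAN once per radius, from the densest setting to the sparsest."""
    return {eps: dbscan(points, eps, min_pts) for eps in sorted(eps_values)}

# Toy data (made up for illustration): a tight group, a looser group, one outlier.
points = [(0, 0), (0, 0.1), (0.1, 0), (0.1, 0.1),
          (5, 5), (5, 6), (6, 5), (6, 6),
          (20, 20)]
runs = multi_density_runs(points, eps_values=[0.2, 1.5], min_pts=3)
```

On this toy data the small radius recovers only the tight group, while the larger radius also recovers the looser one. A single global radius has to trade one off against the other, which is the kind of disagreement between runs that a merging step would have to reconcile.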
| Original language | English |
|---|---|
| Pages (from-to) | 63-74 |
| Number of pages | 12 |
| Journal | International Journal of Data Warehousing and Mining |
| Volume | 13 |
| Issue number | 2 |
| Publication status | Published - 1 Apr 2017 |
Keywords
- Clustering
- Density
- Distance
- Images
- Parameter Free Affinity Propagation (PFAP)
Fingerprint
Dive into the research topics of 'Robust clustering with distance and density'. Together they form a unique fingerprint.