Robust clustering with distance and density

Hanning Yuan*, Shuliang Wang, Jing Geng, Yang Yu, Ming Zhong

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Clustering is fundamental to making use of big data. However, AP (affinity propagation) performs poorly on non-convex datasets, and the input parameters of DBSCAN (density-based spatial clustering of applications with noise) have a marked impact on its results. Moreover, the characteristics of big data, such as volume, variety, velocity, and veracity, make it difficult to cluster. To address these issues, a parameter-free AP (PFAP) is proposed that groups big data on the basis of both distance and density. First, it obtains a set of normalized densities from the AP clustering. The estimated parameters are monotonic. Then, these densities are used to run density clustering multiple times. Finally, the multiple density-clustering results undergo a two-stage amalgamation to produce the final clustering result. Experimental results on several benchmark datasets show that PFAP achieves better clustering quality than DBSCAN, AP, and APSCAN, and better performance than APSCAN and FSDP.
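
The abstract does not say how PFAP derives its densities from AP or how the two-stage amalgamation works, so the following is only a rough sketch of the general idea of running density clustering several times at different radii. Everything specific in it is an assumption made for illustration: the function names, the hard-coded `eps_values` (standing in for densities that PFAP would estimate from AP), the `min_pts` threshold, and the simple "denser run wins" merging rule, which is a stand-in for, not a reproduction of, the paper's amalgamation step.

```python
from math import dist


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep growing
                queue.extend(j_neighbors)
    return labels


def multi_density_dbscan(points, eps_values, min_pts):
    """Run DBSCAN once per radius, smallest radius first.

    Assumed merging rule (not the paper's): a point keeps the label from the
    first (densest) run that clusters it; later runs see only leftover points.
    """
    final = [-1] * len(points)
    next_id = 0
    for eps in sorted(eps_values):
        remaining = [i for i, lab in enumerate(final) if lab == -1]
        if not remaining:
            break
        labels = dbscan([points[i] for i in remaining], eps, min_pts)
        for idx, lab in zip(remaining, labels):
            if lab != -1:
                final[idx] = next_id + lab
        next_id += max(labels, default=-1) + 1
    return final


# Toy data: one tight group, one loose group, one isolated point.
points = [
    (0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
    (50.0, 50.0),
]
labels = multi_density_dbscan(points, eps_values=[0.2, 1.5], min_pts=3)
print(labels)
```

On this toy data the run with the small radius picks up only the tight group, the run with the larger radius then picks up the loose group from the leftover points, and the isolated point stays labelled as noise. In PFAP, by contrast, the radii would not be hand-picked but derived from the AP clustering, which is what makes the method parameter free.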

Original language: English
Pages (from-to): 63-74
Number of pages: 12
Journal: International Journal of Data Warehousing and Mining
Volume: 13
Issue number: 2
Publication status: Published - 1 Apr 2017

Keywords

  • Clustering
  • Density
  • Distance
  • Images
  • Parameter Free Affinity Propagation (PFAP)
