TY - JOUR
T1 - Robust clustering with distance and density
AU - Yuan, Hanning
AU - Wang, Shuliang
AU - Geng, Jing
AU - Yu, Yang
AU - Zhong, Ming
N1 - Publisher Copyright: Copyright © 2017, IGI Global.
PY - 2017/4/1
Y1 - 2017/4/1
N2 - Clustering is fundamental to using big data. However, AP (affinity propagation) performs poorly on non-convex datasets, and the input parameter has a marked impact on DBSCAN (density-based spatial clustering of applications with noise). Moreover, new characteristics such as volume, variety, velocity, and veracity make it difficult to group big data. To address these issues, a parameter-free AP (PFAP) is proposed to group big data on the basis of both distance and density. First, it obtains a group of normalized densities from the AP clustering. The estimated parameters are monotonic. Then, these densities are used to run density clustering multiple times. Finally, the multiple-density clustering results undergo a two-stage amalgamation to produce the final clustering result. Experimental results on several benchmark datasets show that PFAP achieves better clustering quality than DBSCAN, AP, and APSCAN. It also performs better than APSCAN and FSDP.
AB - Clustering is fundamental to using big data. However, AP (affinity propagation) performs poorly on non-convex datasets, and the input parameter has a marked impact on DBSCAN (density-based spatial clustering of applications with noise). Moreover, new characteristics such as volume, variety, velocity, and veracity make it difficult to group big data. To address these issues, a parameter-free AP (PFAP) is proposed to group big data on the basis of both distance and density. First, it obtains a group of normalized densities from the AP clustering. The estimated parameters are monotonic. Then, these densities are used to run density clustering multiple times. Finally, the multiple-density clustering results undergo a two-stage amalgamation to produce the final clustering result. Experimental results on several benchmark datasets show that PFAP achieves better clustering quality than DBSCAN, AP, and APSCAN. It also performs better than APSCAN and FSDP.
KW - Clustering
KW - Density
KW - Distance
KW - Images
KW - Parameter Free Affinity Propagation (PFAP)
UR - http://www.scopus.com/inward/record.url?scp=85019359798&partnerID=8YFLogxK
U2 - 10.4018/IJDWM.2017040104
DO - 10.4018/IJDWM.2017040104
M3 - Article
AN - SCOPUS:85019359798
SN - 1548-3924
VL - 13
SP - 63
EP - 74
JO - International Journal of Data Warehousing and Mining
JF - International Journal of Data Warehousing and Mining
IS - 2
ER -