TY - GEN
T1 - Optimal bandwidth selection for density-based clustering
AU - Jin, Hong
AU - Wang, Shuliang
AU - Zhou, Qian
AU - Li, Ying
N1 - Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 2011.
PY - 2011
Y1 - 2011
N2 - Cluster analysis has long played an important role in a wide variety of data applications. When the clusters are irregular or intertwined, density-based clustering is proved to be much more efficient. The quality of clustering result depends on an adequate choice of the parameters. However, without enough domain knowledge the parameter setting is somewhat limited in its operability. In this paper, a new method is proposed to automatically find out the optimal parameter value of the bandwidth. It is to infer the most suitable parameter value by the constructed model on parameter estimation. Based on the Bayesian Theorem, from which the most probability value for the bandwidth can be acquired in accordance with the inherent distribution characteristics of the original data set. Clusters can then be identified by the determined parameter values. The results of the experiment show that the proposed method has complementary advantages in the density-based clustering algorithm.
AB - Cluster analysis has long played an important role in a wide variety of data applications. When the clusters are irregular or intertwined, density-based clustering is proved to be much more efficient. The quality of clustering result depends on an adequate choice of the parameters. However, without enough domain knowledge the parameter setting is somewhat limited in its operability. In this paper, a new method is proposed to automatically find out the optimal parameter value of the bandwidth. It is to infer the most suitable parameter value by the constructed model on parameter estimation. Based on the Bayesian Theorem, from which the most probability value for the bandwidth can be acquired in accordance with the inherent distribution characteristics of the original data set. Clusters can then be identified by the determined parameter values. The results of the experiment show that the proposed method has complementary advantages in the density-based clustering algorithm.
KW - Bayesian posterior probability estimation
KW - Density-based clustering
KW - Optimal bandwidth selection
UR - http://www.scopus.com/inward/record.url?scp=85025128098&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-20244-5_15
DO - 10.1007/978-3-642-20244-5_15
M3 - Conference contribution
AN - SCOPUS:85025128098
SN - 9783642202438
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 156
EP - 167
BT - Database Systems for Advanced Applications - 16th International Conference, DASFAA 2011, International Workshops
A2 - Xu, Jianliang
A2 - Yu, Ge
A2 - Zhou, Shuigeng
A2 - Unland, Rainer
PB - Springer Verlag
T2 - 16th International Conference on Database Systems for Advanced Applications, DASFAA 2011
Y2 - 22 April 2011 through 25 April 2011
ER -