Abstract
Clustering is an unsupervised learning method widely used to identify the inherent structure of data, with applications in fields such as data mining, pattern recognition, and machine learning. This study proposes a new topological clustering method called δ-open set clustering. The key idea is to determine the δ-open sets in the data, where each δ-open set represents one specific category of data. The method is shown to perform robustly even on complex data sets: it can classify data sets with diverse cluster shapes, recognize noise, and handle high-dimensional data. It remains effective even when the data distribution is unbalanced. The clustering process requires a single input parameter, namely the value of δ. A face identification experiment on the Olivetti Face Database indicates that this method performs much more reliably than the peak clustering method. We also provide an improved variant of δ-open set clustering that can handle clusters with extreme density differences. This article is categorized under: Technologies > Structure Discovery and Clustering; Algorithmic Development > Structure Discovery.
Original language | English |
---|---|
Article number | e1262 |
Journal | Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery |
Volume | 8 |
Issue number | 6 |
DOIs | |
Publication status | Published - 1 Nov 2018 |
Keywords
- clustering method
- complex data
- open set
- robust performance