Robust K-median and K-means clustering algorithms for incomplete data

Jinhua Li, Shiji Song*, Yuli Zhang, Zhen Zhou

*Corresponding author for this work

Research output: Contribution to journal, article, peer-reviewed

26 citations (Scopus)

Abstract

Incomplete data with missing feature values are prevalent in clustering problems. Traditional clustering methods first estimate the missing values by imputation and then apply the classical clustering algorithms for complete data, such as K-median and K-means. However, in practice, it is often hard to obtain accurate estimation of the missing values, which deteriorates the performance of clustering. To enhance the robustness of clustering algorithms, this paper represents the missing values by interval data and introduces the concept of robust cluster objective function. A minimax robust optimization (RO) formulation is presented to provide clustering results, which are insensitive to estimation errors. To solve the proposed RO problem, we propose robust K-median and K-means clustering algorithms with low time and space complexity. Comparisons and analysis of experimental results on both artificially generated and real-world incomplete data sets validate the robustness and effectiveness of the proposed algorithms.
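
The abstract's idea of a worst-case (minimax) objective over interval data can be illustrated with a short sketch. It is not the paper's algorithm. It assumes, for illustration only, that every feature value is given as an interval [lo, hi] (observed values have lo == hi), that the intervals vary independently, and that squared Euclidean distance is used as in K-means. The names `worst_case_sq_dist` and `robust_assign` are hypothetical. Under these assumptions, the farthest value in an interval from a fixed center coordinate is always one of the two endpoints, so the worst case can be computed feature by feature.

```python
def worst_case_sq_dist(center, lo, hi):
    """Largest squared Euclidean distance from `center` to any point in the box [lo, hi].

    Assumption: each feature varies independently within its interval, so the
    worst case per feature is reached at one of the two interval endpoints.
    """
    total = 0.0
    for c, a, b in zip(center, lo, hi):
        total += max((c - a) ** 2, (c - b) ** 2)
    return total


def robust_assign(centers, lows, highs):
    """Assign each interval-valued point to the center with the smallest
    worst-case distance, and return the labels with the summed objective."""
    labels = []
    objective = 0.0
    for lo, hi in zip(lows, highs):
        dists = [worst_case_sq_dist(c, lo, hi) for c in centers]
        best = min(range(len(centers)), key=dists.__getitem__)
        labels.append(best)
        objective += dists[best]
    return labels, objective


# Toy data: three 2-D points; the second and third each have one uncertain feature.
lows = [[0.0, 0.0], [1.0, -1.0], [9.0, 10.0]]
highs = [[0.0, 0.0], [1.0, 1.0], [11.0, 10.0]]
centers = [[0.0, 0.0], [10.0, 10.0]]

labels, objective = robust_assign(centers, lows, highs)
print(labels, objective)
```

This sketch covers only the assignment step for fixed centers. A full robust clustering procedure would also need a center-update step that minimizes the same worst-case objective, which is omitted here.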

Original language: English
Article number: 4321928
Journal: Mathematical Problems in Engineering, 2016
Publication status: Published - 2016
Externally published
