TY - JOUR
T1 - Robust K-median and K-means clustering algorithms for incomplete data
AU - Li, Jinhua
AU - Song, Shiji
AU - Zhang, Yuli
AU - Zhou, Zhen
N1 - Publisher Copyright:
© 2016 Jinhua Li et al.
PY - 2016
Y1 - 2016
N2 - Incomplete data with missing feature values are prevalent in clustering problems. Traditional clustering methods first estimate the missing values by imputation and then apply the classical clustering algorithms for complete data, such as K-median and K-means. However, in practice, it is often hard to obtain accurate estimation of the missing values, which deteriorates the performance of clustering. To enhance the robustness of clustering algorithms, this paper represents the missing values by interval data and introduces the concept of robust cluster objective function. A minimax robust optimization (RO) formulation is presented to provide clustering results, which are insensitive to estimation errors. To solve the proposed RO problem, we propose robust K-median and K-means clustering algorithms with low time and space complexity. Comparisons and analysis of experimental results on both artificially generated and real-world incomplete data sets validate the robustness and effectiveness of the proposed algorithms.
AB - Incomplete data with missing feature values are prevalent in clustering problems. Traditional clustering methods first estimate the missing values by imputation and then apply the classical clustering algorithms for complete data, such as K-median and K-means. However, in practice, it is often hard to obtain accurate estimation of the missing values, which deteriorates the performance of clustering. To enhance the robustness of clustering algorithms, this paper represents the missing values by interval data and introduces the concept of robust cluster objective function. A minimax robust optimization (RO) formulation is presented to provide clustering results, which are insensitive to estimation errors. To solve the proposed RO problem, we propose robust K-median and K-means clustering algorithms with low time and space complexity. Comparisons and analysis of experimental results on both artificially generated and real-world incomplete data sets validate the robustness and effectiveness of the proposed algorithms.
UR - http://www.scopus.com/inward/record.url?scp=85008608296&partnerID=8YFLogxK
U2 - 10.1155/2016/4321928
DO - 10.1155/2016/4321928
M3 - Article
AN - SCOPUS:85008608296
SN - 1024-123X
VL - 2016
JO - Mathematical Problems in Engineering
JF - Mathematical Problems in Engineering
M1 - 4321928
ER -