TY - JOUR
T1 - A Euclidean Distance Matrix Model for Convex Clustering
AU - Wang, Z. W.
AU - Liu, X. W.
AU - Li, Q. N.
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
PY - 2025/4
Y1 - 2025/4
N2 - Clustering has been one of the most basic and essential problems in unsupervised learning due to various applications in many critical fields. The recently proposed sum-of-norms (SON) model by Pelckmans et al. (in: PASCAL workshop on statistics and optimization of clustering, 2005), Lindsten et al. (in: IEEE statistical signal processing workshop, 2011) and Hocking et al. (in: Proceedings of the 28th international conference on machine learning, 2011) has received a lot of attention. The advantage of the SON model is the theoretical guarantee in terms of perfect recovery, established by Sun et al. (J Mach Learn Res 22(9):1–32, 2018). It also provides great opportunities for designing efficient algorithms for solving the SON model. The semismooth Newton-based augmented Lagrangian method by Sun et al. (2018) has demonstrated its superior performance over the alternating direction method of multipliers and the alternating minimization algorithm. In this paper, we propose a Euclidean distance matrix model based on the SON model. The exact recovery property is achieved under proper assumptions. An efficient majorization penalty algorithm is proposed to solve the resulting model. Extensive numerical experiments are conducted to demonstrate the efficiency of the proposed model and the majorization penalty algorithm.
KW - Clustering
KW - Euclidean distance matrix
KW - Majorization penalty method
KW - Unsupervised learning
UR - https://www.scopus.com/pages/publications/85218352245
U2 - 10.1007/s10957-025-02616-5
DO - 10.1007/s10957-025-02616-5
M3 - Article
AN - SCOPUS:85218352245
SN - 0022-3239
VL - 205
JO - Journal of Optimization Theory and Applications
JF - Journal of Optimization Theory and Applications
IS - 1
M1 - 1
ER -