TY - JOUR
T1 - DSP-TMM
T2 - A Robust Cluster Analysis Method Based on Diversity Self-Paced T-Mixture Model
AU - Pan, Limin
AU - Qin, Xiaonan
AU - Luo, Senlin
N1 - Publisher Copyright: © 2020 Journal of Beijing Institute of Technology
PY - 2020/12
Y1 - 2020/12
N2 - To make cluster analysis robust against outliers, which seriously disturb the estimation of probability density parameters and thereby reduce clustering accuracy, a robust cluster analysis method based on the diversity self-paced t-mixture model is proposed. The model first adopts the t-distribution, whose tail is easily controllable, as the sub-model. On this basis, it uses the entropy penalty expectation conditional maximization algorithm as a pre-clustering step to estimate the initial parameters. The model then introduces the l2,1-norm as a self-paced regularization term and develops a new ECM optimization algorithm to select high-confidence samples from each component during training. Finally, experimental results on several real-world datasets in different noise environments show that the diversity self-paced t-mixture model outperforms state-of-the-art clustering methods. The work provides guidance for constructing robust mixture distribution models.
AB - To make cluster analysis robust against outliers, which seriously disturb the estimation of probability density parameters and thereby reduce clustering accuracy, a robust cluster analysis method based on the diversity self-paced t-mixture model is proposed. The model first adopts the t-distribution, whose tail is easily controllable, as the sub-model. On this basis, it uses the entropy penalty expectation conditional maximization algorithm as a pre-clustering step to estimate the initial parameters. The model then introduces the l2,1-norm as a self-paced regularization term and develops a new ECM optimization algorithm to select high-confidence samples from each component during training. Finally, experimental results on several real-world datasets in different noise environments show that the diversity self-paced t-mixture model outperforms state-of-the-art clustering methods. The work provides guidance for constructing robust mixture distribution models.
KW - Cluster analysis
KW - Gaussian mixture model
KW - Initialization
KW - Self-paced learning
KW - T-distribution mixture model
UR - http://www.scopus.com/inward/record.url?scp=85098695837&partnerID=8YFLogxK
U2 - 10.15918/j.jbit1004-0579.20070
DO - 10.15918/j.jbit1004-0579.20070
M3 - Article
AN - SCOPUS:85098695837
SN - 1004-0579
VL - 29
SP - 531
EP - 543
JO - Journal of Beijing Institute of Technology (English Edition)
JF - Journal of Beijing Institute of Technology (English Edition)
IS - 4
ER -