A novel open-set clustering algorithm

Qi Li, Guochen Yan, Shuliang Wang*, Boxiang Zhao

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

DOS (Delta Open Set) is a clustering algorithm that recasts cluster identification as set identification: objects whose neighborhoods coincide are identified as an open-set, and each open-set corresponds to a cluster. On complex datasets, however, DOS tends to identify overlapping clusters as a single category. We believe the main reason is that DOS unifies the neighborhood radius through a specific function, which leaves it unable to cope with varied object distributions. To improve DOS, we propose DOS-IN (Irregular Neighborhoods). Specifically, DOS-IN generates irregular neighborhoods based on the similarity between objects, so that it adapts to diverse object distributions. As a result, DOS-IN can not only distinguish overlapping clusters accurately but also requires fewer input parameters. In addition, DOS-IN introduces a small-cluster merging mechanism to address the shortcoming of DOS in recognizing Gaussian clusters. The experimental results show that DOS-IN is completely superior to DOS. Compared with baseline methods, DOS-IN outperforms them on 7 out of 10 datasets, with at least 13.8% (NMI) and 2.4% (RI) improvement in accuracy. The code of DOS-IN is available at https://github.com/Youth-49/2023-DOS-IN.
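
The limitation of a single neighborhood radius can be illustrated with a toy sketch. This is not the DOS or DOS-IN procedure from the paper, whose neighborhood definitions the abstract does not give. It assumes, purely for illustration, that two objects are linked when they lie within one shared radius of each other and that a cluster is a connected group of linked objects. The function name `fixed_radius_clusters` and the sample data are hypothetical.

```python
# Illustrative sketch only: NOT the DOS or DOS-IN procedure from the paper.
# Assumption: two objects are linked when they lie within one shared, fixed
# radius of each other, and a cluster is a connected group of linked objects.

def fixed_radius_clusters(points, radius):
    """Group 1-D points by linking every pair at most `radius` apart."""
    parent = list(range(len(points)))

    def find(i):
        # Walk up to the group representative, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if abs(points[i] - points[j]) <= radius:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(points[i])
    return sorted(groups.values())


# Hypothetical data: a dense group sitting next to a sparse group.
dense = [0.0, 0.1, 0.2, 0.3]
sparse = [0.8, 1.8, 2.8]
points = dense + sparse

small = fixed_radius_clusters(points, radius=0.15)  # tuned to the dense group
large = fixed_radius_clusters(points, radius=1.1)   # tuned to the sparse group

print(len(small), len(large))
```

With the small radius the dense group is recovered but the sparse group splits into singletons; with the large radius both groups collapse into one cluster. In this toy setting no single radius separates the two groups, which is the kind of behavior that motivates neighborhoods adapted to each object's surroundings.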

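Original language: English
Article number: 119561
Journal: Information Sciences
Volume: 648
DOI: https://doi.org/10.1016/j.ins.2023.119561
Publication status: Published - Nov 2023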

Cite this

Li, Q., Yan, G., Wang, S., & Zhao, B. (2023). A novel open-set clustering algorithm. Information Sciences, 648, Article 119561. https://doi.org/10.1016/j.ins.2023.119561