HIBOG: Improving the clustering accuracy by ameliorating dataset with gravitation

Qi Li, Shuliang Wang*, Chuanfeng Zhao, Boxiang Zhao, Xin Yue, Jing Geng

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

11 Citations (Scopus)

Abstract

Clustering is an important technique applied in many fields. To obtain more accurate results, most researchers focus only on clustering algorithms. However, this is not an optimal strategy, because each algorithm has its own advantages and disadvantages, and no single algorithm produces satisfactory results on all datasets. This paper focuses on the datasets instead and proposes a method called HIBOG, which improves clustering accuracy by ameliorating datasets with gravitation. HIBOG helps many clustering algorithms obtain better results on more datasets by moving similar objects closer together and dissimilar objects further apart. As a result, ameliorated datasets are friendlier to many clustering algorithms than the original datasets. Although datasets are diverse, HIBOG can cope with this diversity to some extent because it is robust to high-dimensional datasets, Gaussian-distributed datasets, shaped datasets, and datasets with highly overlapping clusters. We conducted numerous experiments on real-world datasets to verify the effectiveness of HIBOG. The experiments show that HIBOG improves the accuracy of different clustering algorithms, with an average increase of 113.4% (excluding the maximum and minimum). Moreover, compared with other similar methods, HIBOG yields a much larger improvement in clustering accuracy and dramatically shortens the running time. We also conducted 360 experiments, each with different parameter values. These experiments show that most values enable HIBOG to ameliorate datasets and that HIBOG is strongly robust to parameter adjustment.
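The abstract does not give HIBOG's update rule, so the following is only a rough sketch of the general idea it describes (moving each object toward nearby objects so that dense groups contract before clustering). It is not the HIBOG algorithm. The function name `pull_toward_neighbors` and the parameters `k`, `step`, and `iterations` are assumptions introduced for illustration.

```python
import numpy as np


def pull_toward_neighbors(points, k=3, step=0.5, iterations=5):
    """Move each point a fraction `step` toward the centroid of its k nearest neighbors.

    Illustrative assumption: a generic neighbor-attraction heuristic,
    NOT the gravitation rule defined in the HIBOG paper.
    """
    data = np.asarray(points, dtype=float).copy()
    for _ in range(iterations):
        # Pairwise Euclidean distances, shape (n, n).
        diffs = data[:, None, :] - data[None, :, :]
        dists = np.linalg.norm(diffs, axis=2)
        # Exclude each point from its own neighbor list.
        np.fill_diagonal(dists, np.inf)
        # Indices of the k nearest neighbors of every point, shape (n, k).
        nearest = np.argsort(dists, axis=1)[:, :k]
        # Centroid of each point's neighbors, shape (n, d).
        centroids = data[nearest].mean(axis=1)
        # Synchronous update: every point moves at the same time.
        data = data + step * (centroids - data)
    return data


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic groups (hypothetical data, not from the paper).
    original = np.vstack([
        rng.normal(0.0, 1.0, (20, 2)),
        rng.normal(10.0, 1.0, (20, 2)),
    ])
    ameliorated = pull_toward_neighbors(original)
    print("spread before:", original[:20].std(axis=0))
    print("spread after: ", ameliorated[:20].std(axis=0))
```

The adjusted array would then be passed to any clustering algorithm in place of the original data. Because each point only moves toward its own nearest neighbors, well-separated groups contract without being pulled toward one another, which is the effect the abstract attributes to an ameliorated dataset.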

Original language: English
Pages (from-to): 41-56
Number of pages: 16
Journal: Information Sciences
Volume: 550
Publication status: Published - Mar 2021

Keywords

  • Clustering
  • Good datasets
  • Gravitation
  • Improving accuracy
