TY - GEN
T1 - Local Directional Centrality Clustering Based on K-nearest Neighbor Outlier Detection and Shared Neighborhood Strategy
AU - Liu, Qing
AU - Feng, Zihang
AU - Yan, Liping
AU - Xia, Yuanqing
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Clustering, an important preliminary step in data analysis, uncovers latent structure in the data that informs subsequent analysis and processing. The recently proposed boundary-seeking Clustering algorithm using the local Direction Centrality (CDC) is effective for clustering data with heterogeneous density and weak connectivity. However, it has two shortcomings. First, the reachable distance alone cannot fully distinguish weakly connected cases, which easily leads to cross-cluster connection errors. Second, K-nearest neighbor search tends to cross cluster boundaries, causing boundary points to be misjudged and producing connection errors. This paper proposes a local direction centrality clustering algorithm based on K-nearest neighbor outlier detection and a shared neighborhood strategy (SODCDC) for sparse and weakly connected data. The algorithm uses a K-nearest neighbor outlier detection strategy to mitigate cross-cluster K-nearest neighbor search and reduce the probability of boundary point misjudgment. It also uses a shared neighborhood strategy to further prevent cross-cluster connections in weakly connected data. Experiments on several datasets show that the proposed algorithm outperforms the original algorithm under commonly used evaluation metrics.
KW - Direction centrality metric
KW - K-nearest neighbor outlier detection
KW - Shared neighborhood
KW - Sparsity
KW - Weak connection
UR - https://www.scopus.com/pages/publications/85202448577
U2 - 10.1109/DDCLS61622.2024.10606710
DO - 10.1109/DDCLS61622.2024.10606710
M3 - Conference contribution
AN - SCOPUS:85202448577
T3 - Proceedings of 2024 IEEE 13th Data Driven Control and Learning Systems Conference, DDCLS 2024
SP - 484
EP - 489
BT - Proceedings of 2024 IEEE 13th Data Driven Control and Learning Systems Conference, DDCLS 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 13th IEEE Data Driven Control and Learning Systems Conference, DDCLS 2024
Y2 - 17 May 2024 through 19 May 2024
ER -