ICGT: A novel incremental clustering approach based on GMM tree

Yuchai Wan, Xiabi Liu*, Yi Wu, Lunhao Guo, Qiming Chen, Murong Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

11 Citations (Scopus)

Abstract

Streaming data presents new challenges to data mining algorithms. To cluster streaming data, this paper proposes a novel incremental clustering approach based on the Gaussian Mixture Model (GMM), termed ICGT (Incremental Construction of GMM Tree). ICGT creates and dynamically adjusts a GMM tree consistent with the sequentially presented data. Each leaf node in the tree corresponds to a dense Gaussian distribution, and each non-leaf node to a GMM. To update the GMM tree when newly arrived data points are inserted, we introduce the definitions of node connectivity and connected subsets and present the tree update algorithm. We further develop a clustering evaluation criterion and a search strategy to determine the final partition of the data set from the constructed GMM tree. We evaluated the proposed approach on synthetic and real-world data sets and compared ICGT with other incremental and static clustering methods. The experimental results confirm that our approach is effective and promising.
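
The tree structure described in the abstract (leaf nodes as single Gaussians, non-leaf nodes as GMMs) can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `Leaf` and `Inner`, the restriction to 1-D Gaussians, and the choice of mixture weights proportional to each leaf's point count are all assumptions made for illustration. The paper's tree update algorithm, node connectivity, and evaluation criterion are not reproduced here.

```python
import math
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Leaf:
    """Hypothetical leaf node: one 1-D Gaussian summarising `count` points."""
    mean: float
    var: float
    count: int

    def pdf(self, x: float) -> float:
        # Density of a univariate normal distribution at x.
        z = (x - self.mean) ** 2 / (2 * self.var)
        return math.exp(-z) / math.sqrt(2 * math.pi * self.var)

    def leaves(self) -> List["Leaf"]:
        return [self]


@dataclass
class Inner:
    """Hypothetical non-leaf node: a GMM over all leaves beneath it."""
    children: List[Union["Inner", Leaf]] = field(default_factory=list)

    def leaves(self) -> List[Leaf]:
        # Collect the Gaussian components from every subtree.
        return [leaf for child in self.children for leaf in child.leaves()]

    def pdf(self, x: float) -> float:
        # Assumption: mixture weights are proportional to leaf point counts.
        comps = self.leaves()
        total = sum(c.count for c in comps)
        return sum(c.count / total * c.pdf(x) for c in comps)


# A small tree: the root mixes one two-component subtree and one leaf.
tree = Inner([
    Inner([Leaf(0.0, 1.0, 30), Leaf(1.0, 1.0, 10)]),
    Leaf(8.0, 0.5, 60),
])
print(len(tree.leaves()), tree.pdf(8.0))
```

In this sketch, cutting the tree at different depths yields coarser or finer groupings of the Gaussian components, which is the kind of choice the abstract's evaluation criterion and search strategy are said to resolve.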

Original language: English
Pages (from-to): 71-86
Number of pages: 16
Journal: Data and Knowledge Engineering
Volume: 117
Publication status: Published - Sept 2018

Keywords

  • Gaussian mixture model (GMM)
  • Incremental data clustering
  • Streaming data
  • Tree structure
