GMM-ClusterForest: A novel indexing approach for multi-features based similarity search in high-dimensional spaces

Yuchai Wan*, Xiabi Liu, Kunqi Tong, Xue Wei, Yi Wu, Fei Guan, Kunpeng Pang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Citations (Scopus)

Abstract

This paper proposes a novel clustering-based indexing approach, called GMM-ClusterForest, for multi-feature similarity search in high-dimensional spaces. We fit a Gaussian Mixture Model (GMM) to the data, using the Expectation-Maximization (EM) algorithm to estimate the GMM parameters and the Minimum Description Length (MDL) criterion to select the GMM structure. Each Gaussian component of the GMM is taken as a cluster center, and each data point is assigned to a cluster according to the Bayesian decision rule. Applying this clustering method hierarchically yields an index tree, with a corresponding similarity search method, for one type of feature. Multi-feature similarity search is then achieved by fusing the index trees built for all the feature types considered. We evaluated the proposed indexing approach by applying it to example-based image retrieval, with experiments on the Corel 1000 dataset and a large self-collected dataset. The experimental results show that our approach is effective and promising.
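The pipeline described in the abstract for a single feature type can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state: diagonal covariances, a penalty of half the parameter count times log N for MDL, a stopping rule based on a hypothetical `min_size`, and a search that descends to one leaf and scans it linearly. All function names (`fit_gmm`, `select_gmm`, `build_tree`, `search`) are illustrative. Fusion of trees across feature types is left out because the abstract gives no detail on it.

```python
import math
import random

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x (assumed form)."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def posteriors(x, weights, means, variances):
    """Posterior P(component | x) by Bayes' rule, together with log p(x)."""
    logs = [math.log(w) + log_gauss(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # log-sum-exp trick for numerical stability
    log_px = top + math.log(sum(math.exp(l - top) for l in logs))
    return [math.exp(l - log_px) for l in logs], log_px

def fit_gmm(data, k, iters=50, seed=0, floor=1e-3):
    """Plain EM for a k-component diagonal GMM; returns (params, log-likelihood)."""
    rng = random.Random(seed)
    dim, n = len(data[0]), len(data)
    means = [list(p) for p in rng.sample(data, k)]
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibilities of every component for every point
        resp = [posteriors(x, weights, means, variances)[0] for x in data]
        # M-step: re-estimate weights, means and (floored) variances
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                        for d in range(dim)]
            variances[j] = [max(floor, sum(r[j] * (x[d] - means[j][d]) ** 2
                                           for r, x in zip(resp, data)) / nj)
                            for d in range(dim)]
    loglik = sum(posteriors(x, weights, means, variances)[1] for x in data)
    return (weights, means, variances), loglik

def mdl(loglik, k, dim, n):
    """MDL score (lower is better): -log-likelihood + 0.5 * #params * log n."""
    n_params = k * 2 * dim + (k - 1)  # means + variances + free weights
    return -loglik + 0.5 * n_params * math.log(n)

def select_gmm(data, max_k=4):
    """Fit GMMs with 1..max_k components and keep the one with the lowest MDL."""
    best = None
    for k in range(1, min(max_k, len(data)) + 1):
        params, loglik = fit_gmm(data, k)
        score = mdl(loglik, k, len(data[0]), len(data))
        if best is None or score < best[0]:
            best = (score, params)
    return best[1]

def assign(x, params):
    """Bayesian decision rule: index of the component with the highest posterior."""
    post, _ = posteriors(x, *params)
    return max(range(len(post)), key=post.__getitem__)

def build_tree(data, min_size=20, max_k=4):
    """Cluster recursively; children are keyed by GMM component index."""
    node = {"params": None, "children": {}, "points": data}
    if len(data) < min_size:
        return node
    params = select_gmm(data, max_k)
    groups = {}
    for x in data:
        groups.setdefault(assign(x, params), []).append(x)
    if len(groups) < 2:  # no real split: keep this node as a leaf
        return node
    node["params"] = params
    node["children"] = {c: build_tree(g, min_size, max_k) for c, g in groups.items()}
    return node

def search(tree, query):
    """Descend to one leaf via the decision rule, then scan its points linearly."""
    node = tree
    while node["children"]:
        comp = assign(query, node["params"])
        if comp not in node["children"]:
            break  # component received no points; scan the current node instead
        node = node["children"][comp]
    return min(node["points"], key=lambda p: math.dist(p, query))
```

Calling `build_tree(points)` on a list of equal-length feature vectors and then `search(tree, query)` returns the stored vector closest to `query` within the leaf the query falls into. Because only one leaf is scanned, the result is approximate; a real index would likely visit several candidate branches.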

Original language: English
Title of host publication: Neural Information Processing - 19th International Conference, ICONIP 2012, Proceedings
Pages: 210-217
Number of pages: 8
Edition: PART 2
Publication status: Published - 2012
Event: 19th International Conference on Neural Information Processing, ICONIP 2012 - Doha, Qatar
Duration: 12 Nov 2012 to 15 Nov 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 7664 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Neural Information Processing, ICONIP 2012
Country/Territory: Qatar
City: Doha
Period: 12/11/12 to 15/11/12

Keywords

  • Clustering
  • Content-Based Image Retrieval (CBIR)
  • Gaussian Mixture Models (GMM)
  • High-dimensional data indexing
  • Similarity search
