Fast filtering false active subspaces for efficient high dimensional similarity processing

Guoren Wang*, Ge Yu, Junchang Xin, Yuhai Zhao, Ende Zhang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The query space of a similarity query is usually narrowed down by pruning inactive query subspaces, which contain no query results, and keeping active query subspaces, which may contain objects corresponding to the request. However, some active query subspaces contain no query results at all; these are called false active query subspaces. The performance of query processing degrades in the presence of false active query subspaces. Our experiments show that this problem becomes serious when the data are high dimensional, and that the number of accesses to false active subspaces increases as the dimensionality increases. To solve this problem, this paper proposes a space mapping approach that reduces such unnecessary accesses. A given query space can be refined by filtering within its mapped space. To this end, a mapping strategy called maxgap is proposed to improve the efficiency of the refinement processing. Based on this mapping strategy, an index structure called MS-tree and the corresponding query processing algorithms are presented. Finally, the performance of MS-tree is compared with that of its competitors for range queries on a real data set.
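The distinction between inactive, active, and false active subspaces can be illustrated with a minimal sketch. It assumes, purely for illustration, that each subspace is summarized by the axis-aligned bounding box of its points and that the query is a Euclidean range query; the abstract does not specify either, and this is not the MS-tree or the maxgap mapping. The names `min_dist_to_box` and `classify_subspace` are hypothetical.

```python
import math

def min_dist_to_box(q, lo, hi):
    """Smallest Euclidean distance from point q to the axis-aligned box [lo, hi]."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi)))

def classify_subspace(points, q, r):
    """Classify one subspace for a range query with centre q and radius r.

    Assumption: the subspace is represented by the bounding box of its points.
    """
    lo = [min(c) for c in zip(*points)]  # per-dimension lower bounds
    hi = [max(c) for c in zip(*points)]  # per-dimension upper bounds
    if min_dist_to_box(q, lo, hi) > r:
        return "inactive"       # box misses the query ball: pruned without access
    if any(math.dist(p, q) <= r for p in points):
        return "active"         # accessed, and at least one point qualifies
    return "false active"       # accessed, but no point qualifies: wasted access

subspace = [(0.0, 0.0), (4.0, 4.0)]
print(classify_subspace(subspace, (10.0, 10.0), 1.0))
print(classify_subspace(subspace, (0.5, 0.0), 1.0))
print(classify_subspace(subspace, (4.0, 0.0), 1.0))
```

In the last call the query centre lies on a corner of the bounding box, so the box cannot be pruned, yet both stored points are 4 units away. This is the kind of unnecessary access that the paper aims to filter out.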

Original language: English
Pages (from-to): 286-294
Number of pages: 9
Journal: Science in China, Series F: Information Sciences
Volume: 52
Issue number: 2
Publication status: Published - Feb 2009
Externally published: Yes

Keywords

  • False active subspace
  • High dimensional index
  • Refining processing
