High dimensional data indexing technique based on max gap space mapping

Guo Ren Wang*, Jian Mei Huang, Bin Wang, Dong Hong Han, Bai You Qiao, Ge Yu

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

3 Citations (Scopus)

Abstract

In the similarity query processing based on high dimensional indexing, the searching space is usually narrowed down by pruning the inactive subspaces which do not contain any query results. However, among the active subspaces, some of them do not contain any query results at all, those are called false active subspaces. It is obvious that the performance of query processing degrades in the presence of false active subspaces. The problem becomes seriously in the case of high dimensional data. The experiment in this paper shows that the number of accesses to false active subspaces increases as the dimensionality increases. In order to overcome the problem, a space mapping approach is proposed to reduce such unnecessary accesses. For a given query, it can be refined by filtering within its mapped space. A maximal gap space mapping strategy, MaxGapMapping, is proposed to improve the efficiency of the refinement processing. An index structure-MS-tree, the algorithms of construction, and query processing based on this refining method are designed and implemented. Finally, the performance of MS-tree is systemically compared with that of other competitors in terms of range queries based on a real data set.

Original languageEnglish
Pages (from-to)1419-1428
Number of pages10
JournalRuan Jian Xue Bao/Journal of Software
Volume18
Issue number6
DOIs
Publication statusPublished - Jun 2007
Externally publishedYes

Keywords

  • False active subspace
  • High dimensional index
  • Query refining

Fingerprint

Dive into the research topics of 'High dimensional data indexing technique based on max gap space mapping'. Together they form a unique fingerprint.

Cite this