Abstract
In the big data era, efficiently indexing continuously growing databases is becoming vitally important for information retrieval. To adapt incrementally to changes in a database, this paper proposes a novel clustering-based dynamic indexing and retrieval approach. The tree-like indexing structure, termed the CD-Tree, updates itself as data are continually inserted, keeping the tree consistent with the latest state of the database. The nodes of the CD-Tree are fitted by Gaussian Mixture Models, on which we base an efficient updating algorithm. We further present a similarity retrieval method that uses the CD-Tree, combining one-way search with a backtracking strategy to achieve good retrieval accuracy and efficiency. We applied the CD-Tree to example-based image retrieval. The experimental results confirm that our approach is effective and promising.
Original language | English |
---|---|
Pages (from-to) | 243-261 |
Number of pages | 19 |
Journal | Intelligent Data Analysis |
Volume | 21 |
Issue number | 2 |
DOIs | |
Publication status | Published - 2017 |
Keywords
- Gaussian Mixture Models (GMM)
- Information retrieval
- Data stream
- Dynamic indexing
- Tree-like indexing structure