Abstract
Top-k queries in distributed databases have been studied widely in recent years. There exists an inherent uncertainty on the data objects due to imprecise measurements and network delays. In this paper, based on horizontally distributed data among peers, we propose an efficient approach of processing uncertain top-k queries in P2P networks. Firstly, we construct a distributed index using Quad-tree, and based on the index, propose a spatial pruning algorithm. Secondly, we propose the upper bound of top-k probabilistic according to the relationship between local top-k probabilities and global top-k probabilities. We also propose the lower bound of top-k probabilities according to the relationship between skyline probabilities and top-k probabilities. Using the two probabilistic pruning algorithms, we can further reduce computation costs and network overhead of top-k queries, and further reduce the number of candidate sets. Finally, we develop a sampling algorithm to estimate top-k probabilities of candidates. Extensive experiments are conducted to verify the effectiveness and efficiency of the proposed methods.
Original language | English |
---|---|
Pages (from-to) | 2155-2164 |
Number of pages | 10 |
Journal | Jisuanji Xuebao/Chinese Journal of Computers |
Volume | 34 |
Issue number | 11 |
DOIs | |
Publication status | Published - Nov 2011 |
Externally published | Yes |
Keywords
- P2P
- Quad-tree
- Skyline probability
- Top-k query
- Uncertain data