TY - GEN
T1 - Efficient peer-to-peer similarity query processing for high-dimensional data
AU - Yuan, Ye
AU - Wang, Guoren
AU - Sun, Yongjiao
PY - 2010
Y1 - 2010
N2 - Objects such as digital images, text documents, and DNA sequences are usually represented in a high-dimensional feature space. A fundamental issue in peer-to-peer (P2P) systems is supporting efficient similarity search over high-dimensional data in metric spaces. Prior works suffer from fundamental limitations, such as poor adaptivity to highly dynamic networks, poor search efficiency under skewed data, and large maintenance overhead. In this study, we propose an efficient scheme, Dragon, to support P2P similarity search in metric spaces. Dragon achieves its efficiency through the following designs: 1) Dragon is built on our previously designed P2P network, Phoenix, which has optimal routing efficiency in dynamic scenarios. 2) We design a locality-preserving naming algorithm and a routing tree for each peer in Phoenix to support range queries. A radius-estimation method is proposed to transform a kNN query into a range query. 3) A load-balancing algorithm is given to support strong query processing under skewed data distributions. Extensive experiments verify the superiority of Dragon over existing works.
AB - Objects such as digital images, text documents, and DNA sequences are usually represented in a high-dimensional feature space. A fundamental issue in peer-to-peer (P2P) systems is supporting efficient similarity search over high-dimensional data in metric spaces. Prior works suffer from fundamental limitations, such as poor adaptivity to highly dynamic networks, poor search efficiency under skewed data, and large maintenance overhead. In this study, we propose an efficient scheme, Dragon, to support P2P similarity search in metric spaces. Dragon achieves its efficiency through the following designs: 1) Dragon is built on our previously designed P2P network, Phoenix, which has optimal routing efficiency in dynamic scenarios. 2) We design a locality-preserving naming algorithm and a routing tree for each peer in Phoenix to support range queries. A radius-estimation method is proposed to transform a kNN query into a range query. 3) A load-balancing algorithm is given to support strong query processing under skewed data distributions. Extensive experiments verify the superiority of Dragon over existing works.
UR - http://www.scopus.com/inward/record.url?scp=77954286167&partnerID=8YFLogxK
U2 - 10.1109/APWeb.2010.41
DO - 10.1109/APWeb.2010.41
M3 - Conference contribution
AN - SCOPUS:77954286167
SN - 9780769540122
T3 - Advances in Web Technologies and Applications - Proceedings of the 12th Asia-Pacific Web Conference, APWeb 2010
SP - 195
EP - 201
BT - Advances in Web Technologies and Applications - Proceedings of the 12th Asia-Pacific Web Conference, APWeb 2010
T2 - 12th International Asia Pacific Web Conference, APWeb 2010
Y2 - 6 April 2010 through 8 April 2010
ER -