TY - JOUR
T1 - Skyline-join query processing in distributed databases
AU - Bai, Mei
AU - Xin, Junchang
AU - Wang, Guoren
AU - Zimmermann, Roger
AU - Wang, Xite
N1 - Publisher Copyright:
© 2016, Higher Education Press and Springer-Verlag Berlin Heidelberg.
PY - 2016/4/1
Y1 - 2016/4/1
N2 - The skyline-join operator, an important variant of skylines, plays an important role in multi-criteria decision-making problems. However, as the data scale increases, previous methods for skyline-join queries cannot be applied to new applications. Therefore, this paper makes the first attempt to propose a scalable method for processing skyline-join queries in distributed databases. First, a tailored distributed framework is presented to facilitate the computation of skyline-join queries. Second, the distributed skyline-join query algorithm (DSJQ) is designed to process skyline-join queries. DSJQ contains two phases. In the first phase, two filtering strategies are used to filter out unpromising tuples from the original tables. The remaining tuples are transmitted to the corresponding data nodes according to a partition function, which guarantees that tuples with the same join value are transferred to the same node. In the second phase, we design a scheduling plan based on rotations to calculate the final skyline-join result. The scheduling plan ensures that calculations are equally assigned to all the data nodes, and that the calculations on each data node can be processed in parallel without creating a bottleneck node. Finally, the effectiveness of DSJQ is evaluated through a series of experiments.
AB - The skyline-join operator, an important variant of skylines, plays an important role in multi-criteria decision-making problems. However, as the data scale increases, previous methods for skyline-join queries cannot be applied to new applications. Therefore, this paper makes the first attempt to propose a scalable method for processing skyline-join queries in distributed databases. First, a tailored distributed framework is presented to facilitate the computation of skyline-join queries. Second, the distributed skyline-join query algorithm (DSJQ) is designed to process skyline-join queries. DSJQ contains two phases. In the first phase, two filtering strategies are used to filter out unpromising tuples from the original tables. The remaining tuples are transmitted to the corresponding data nodes according to a partition function, which guarantees that tuples with the same join value are transferred to the same node. In the second phase, we design a scheduling plan based on rotations to calculate the final skyline-join result. The scheduling plan ensures that calculations are equally assigned to all the data nodes, and that the calculations on each data node can be processed in parallel without creating a bottleneck node. Finally, the effectiveness of DSJQ is evaluated through a series of experiments.
KW - distributed
KW - filtering strategy
KW - rotation
KW - scheduling plan
KW - skyline-join
UR - https://www.scopus.com/pages/publications/84959458724
U2 - 10.1007/s11704-015-4534-y
DO - 10.1007/s11704-015-4534-y
M3 - Article
AN - SCOPUS:84959458724
SN - 2095-2228
VL - 10
SP - 330
EP - 352
JO - Frontiers of Computer Science
JF - Frontiers of Computer Science
IS - 2
ER -