TY - GEN
T1 - CDB
T2 - 2017 ACM SIGMOD International Conference on Management of Data, SIGMOD 2017
AU - Li, Guoliang
AU - Chai, Chengliang
AU - Fan, Ju
AU - Weng, Xueping
AU - Li, Jian
AU - Zheng, Yudian
AU - Li, Yuanbing
AU - Yu, Xiang
AU - Zhang, Xiaohang
AU - Yuan, Haitao
N1 - Publisher Copyright:
© 2017 ACM.
PY - 2017/5/9
Y1 - 2017/5/9
N2 - Crowdsourcing database systems have been proposed to leverage crowd-powered operations to encapsulate the complexities of interacting with the crowd. Existing systems suffer from two major limitations. Firstly, in order to optimize a query, they often adopt the traditional tree model to select an optimized table-level join order. However, the tree model provides a coarse-grained optimization, which generates the same order for different joined tuples and limits the optimization potential that different joined tuples can be optimized by different orders. Secondly, they mainly focus on optimizing the monetary cost. In fact, there are three optimization goals (i.e., smaller monetary cost, lower latency, and higher quality) in crowdsourcing, and it calls for a system to enable multi-goal optimization. To address the limitations, we develop a crowd-powered database system CDB that supports crowd-based query optimizations, with focus on join and selection. CDB has fundamental differences from existing systems. First, CDB employs a graph-based query model that provides more fine-grained query optimization. Second, CDB adopts a unified framework to perform the multi-goal optimization based on the graph model. We have implemented our system and deployed it on AMT, CrowdFlower and ChinaCrowd. We have also created a benchmark for evaluating crowd-powered databases. We have conducted both simulated and real experiments, and the experimental results demonstrate the performance superiority of CDB on cost, latency and quality.
AB - Crowdsourcing database systems have been proposed to leverage crowd-powered operations to encapsulate the complexities of interacting with the crowd. Existing systems suffer from two major limitations. Firstly, in order to optimize a query, they often adopt the traditional tree model to select an optimized table-level join order. However, the tree model provides a coarse-grained optimization, which generates the same order for different joined tuples and limits the optimization potential that different joined tuples can be optimized by different orders. Secondly, they mainly focus on optimizing the monetary cost. In fact, there are three optimization goals (i.e., smaller monetary cost, lower latency, and higher quality) in crowdsourcing, and it calls for a system to enable multi-goal optimization. To address the limitations, we develop a crowd-powered database system CDB that supports crowd-based query optimizations, with focus on join and selection. CDB has fundamental differences from existing systems. First, CDB employs a graph-based query model that provides more fine-grained query optimization. Second, CDB adopts a unified framework to perform the multi-goal optimization based on the graph model. We have implemented our system and deployed it on AMT, CrowdFlower and ChinaCrowd. We have also created a benchmark for evaluating crowd-powered databases. We have conducted both simulated and real experiments, and the experimental results demonstrate the performance superiority of CDB on cost, latency and quality.
KW - Crowd-based join
KW - Crowd-based selection
KW - Crowdsourcing
KW - Crowdsourcing optimization
UR - http://www.scopus.com/inward/record.url?scp=85021192548&partnerID=8YFLogxK
U2 - 10.1145/3035918.3064036
DO - 10.1145/3035918.3064036
M3 - Conference contribution
AN - SCOPUS:85021192548
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1463
EP - 1478
BT - SIGMOD 2017 - Proceedings of the 2017 ACM International Conference on Management of Data
PB - Association for Computing Machinery
Y2 - 14 May 2017 through 19 May 2017
ER -