CDB: Optimizing queries with crowd-based selections and joins

Guoliang Li, Chengliang Chai, Ju Fan, Xueping Weng, Jian Li, Yudian Zheng, Yuanbing Li, Xiang Yu, Xiaohang Zhang, Haitao Yuan

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

68 Citations (Scopus)

Abstract

Crowdsourcing database systems have been proposed to leverage crowd-powered operations to encapsulate the complexities of interacting with the crowd. Existing systems suffer from two major limitations. Firstly, in order to optimize a query, they often adopt the traditional tree model to select an optimized table-level join order. However, the tree model provides a coarse-grained optimization, which generates the same order for different joined tuples and limits the optimization potential that different joined tuples can be optimized by different orders. Secondly, they mainly focus on optimizing the monetary cost. In fact, there are three optimization goals (i.e., smaller monetary cost, lower latency, and higher quality) in crowdsourcing, and it calls for a system to enable multi-goal optimization. To address the limitations, we develop a crowd-powered database system CDB that supports crowd-based query optimizations, with focus on join and selection. CDB has fundamental differences from existing systems. First, CDB employs a graph-based query model that provides more fine-grained query optimization. Second, CDB adopts a unified framework to perform the multi-goal optimization based on the graph model. We have implemented our system and deployed it on AMT, CrowdFlower and ChinaCrowd. We have also created a benchmark for evaluating crowd-powered databases. We have conducted both simulated and real experiments, and the experimental results demonstrate the performance superiority of CDB on cost, latency and quality.

Original languageEnglish
Title of host publicationSIGMOD 2017 - Proceedings of the 2017 ACM International Conference on Management of Data
PublisherAssociation for Computing Machinery
Pages1463-1478
Number of pages16
ISBN (Electronic)9781450341974
DOIs
Publication statusPublished - 9 May 2017
Externally publishedYes
Event2017 ACM SIGMOD International Conference on Management of Data, SIGMOD 2017 - Chicago, United States
Duration: 14 May 201719 May 2017

Publication series

NameProceedings of the ACM SIGMOD International Conference on Management of Data
VolumePart F127746
ISSN (Print)0730-8078

Conference

Conference2017 ACM SIGMOD International Conference on Management of Data, SIGMOD 2017
Country/TerritoryUnited States
CityChicago
Period14/05/1719/05/17

Keywords

  • Crowd-based join
  • Crowd-based selection
  • Crowdsourcing
  • Crowdsourcing optimization

Fingerprint

Dive into the research topics of 'CDB: Optimizing queries with crowd-based selections and joins'. Together they form a unique fingerprint.

Cite this