众包数据库综述

Translated title of the contribution: Crowd-Powered Database System: A Survey

Cheng Liang Chai, Guo Liang Li*, Tian Yu Zhao, Yu Yu Luo, Ming He Yu

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

1 Citation (Scopus)

Abstract

Many data management tasks cannot be resolved by machine-based algorithms alone. Crowdsourcing, which leverages the ability of the crowd to address problems that are hard for computers, has therefore attracted the interest of many researchers. Thanks to crowdsourcing platforms such as Amazon Mechanical Turk, requesters can easily hire hundreds of thousands of workers to resolve these computer-hard tasks. The technical difficulty of crowdsourcing lies in the complexity of the interactions among its three components (requesters, platforms, and workers), which makes it hard for requesters to use the platforms and manage their tasks. For example, interacting with a crowdsourcing platform is inconvenient because the requester must set parameters and write code to display the tasks. Inspired by traditional DBMSs, crowdsourcing database systems have been proposed to encapsulate the complexities of interacting with the crowd. The challenges include how to make crowdsourcing platforms easy to use, how to design query optimization models that optimize crowdsourcing cost, quality, and latency, and how to support complex crowdsourcing operations. In this paper, we survey a wide spectrum of existing studies on crowdsourcing database systems. We first give an overview of crowdsourcing and then introduce the fundamental techniques for designing crowdsourcing databases, including truth inference, task assignment, and cost control, among others. In this part, we focus on reviewing sophisticated techniques for improving quality, reducing cost, and reducing latency. Next, we describe several popular crowd-powered database systems, including Deco, Qurk, CrowdDB, and CDB, mainly discussing the query languages, query optimization models, and supported operations of these systems. Moreover, we review techniques for designing different operators, such as selection, join, and sort, focusing on how to optimize the cost, quality, and latency of these operators. Finally, we discuss future work and open challenges.

Original language: Chinese (Traditional)
Pages (from-to): 948-972
Number of pages: 25
Journal: Jisuanji Xuebao/Chinese Journal of Computers
Volume: 43
Issue number: 5
DOI: https://doi.org/10.11897/SP.J.1016.2020.00948
Publication status: Published - 1 May 2020
Externally published: Yes

Cite this

Chai, C. L., Li, G. L., Zhao, T. Y., Luo, Y. Y., & Yu, M. H. (2020). 众包数据库综述. Jisuanji Xuebao/Chinese Journal of Computers, 43(5), 948-972. https://doi.org/10.11897/SP.J.1016.2020.00948