Crowdsourcing database systems: Overview and challenges

Chengliang Chai; Ju Fan; Guoliang Li; Jiannan Wang; Yudian Zheng

doi:10.1109/ICDE.2019.00237

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

30 citations (Scopus)

Abstract

Many data management and analytics tasks, such as entity resolution, cannot be solely addressed by automated processes. Crowdsourcing is an effective way to harness the human cognitive ability to process these computer-hard tasks. Thanks to public crowdsourcing platforms, e.g., Amazon Mechanical Turk and CrowdFlower, we can easily involve hundreds of thousands of ordinary workers (i.e., the crowd) to address these computer-hard tasks. However, it is rather inconvenient to interact with the crowdsourcing platforms, because the platforms require one to set parameters and even write code. Inspired by traditional DBMS, crowdsourcing database systems have been proposed and widely studied to encapsulate the complexities of interacting with the crowd. In this tutorial, we will survey and synthesize a wide spectrum of existing studies on crowdsourcing database systems. We first give an overview of crowdsourcing, and then summarize the fundamental techniques in designing crowdsourcing databases, including task design, truth inference, task assignment, answer reasoning, and latency reduction. Next, we review the techniques for designing crowdsourced operators, including selection, join, sort, top-k, max/min, count, collect, and fill. Finally, we discuss the emerging challenges.

Original language	English
Title of host publication	Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019
Publisher	IEEE Computer Society
Pages	2052-2055
Number of pages	4
ISBN (Electronic)	9781538674741
DOI	https://doi.org/10.1109/ICDE.2019.00237
Publication status	Published - Apr 2019
Externally published	Yes
Event	35th IEEE International Conference on Data Engineering, ICDE 2019 - Macau, China. Duration: 8 Apr 2019 → 11 Apr 2019

Publication series

Name	Proceedings - International Conference on Data Engineering
Volume	2019-April
ISSN (Print)	1084-4627

Conference

Conference	35th IEEE International Conference on Data Engineering, ICDE 2019
Country/Territory	China
City	Macau
Period	8/04/19 → 11/04/19

Cite this

Chai, C., Fan, J., Li, G., Wang, J., & Zheng, Y. (2019). Crowdsourcing database systems: Overview and challenges. In Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019 (pp. 2052-2055). Article 8731525 (Proceedings - International Conference on Data Engineering; Vol. 2019-April). IEEE Computer Society. https://doi.org/10.1109/ICDE.2019.00237

@inproceedings{949d90f6cc454dd6b6a6ae818e4d20e0,

title = "Crowdsourcing database systems: Overview and challenges",

abstract = "Many data management and analytics tasks, such as entity resolution, cannot be solely addressed by automated processes. Crowdsourcing is an effective way to harness the human cognitive ability to process these computer-hard tasks. Thanks to public crowdsourcing platforms, e.g., Amazon Mechanical Turk and CrowdFlower, we can easily involve hundreds of thousands of ordinary workers (i.e., the crowd) to address these computer-hard tasks. However it is rather inconvenient to interact with the crowdsourcing platforms, because the platforms require one to set parameters and even write codes. Inspired by traditional DBMS, crowdsourcing database systems have been proposed and widely studied to encapsulate the complexities of interacting with the crowd. In this tutorial, we will survey and synthesize a wide spectrum of existing studies on crowdsourcing database systems. We first give an overview of crowdsourcing, and then summarize the fundamental techniques in designing crowdsourcing databases, including task design, truth inference, task assignment, answer reasoning and latency reduction. Next we review the techniques on designing crowdsourced operators, including selection, join, sort, top-k, max/min, count, collect, and fill. Finally, we discuss the emerging challenges.",

keywords = "Crowdsourcing, Database",

author = "Chengliang Chai and Ju Fan and Guoliang Li and Jiannan Wang and Yudian Zheng",

note = "Publisher Copyright: {\textcopyright} 2019 IEEE.; 35th IEEE International Conference on Data Engineering, ICDE 2019 ; Conference date: 08-04-2019 Through 11-04-2019",

year = "2019",

month = apr,

doi = "10.1109/ICDE.2019.00237",

language = "English",

series = "Proceedings - International Conference on Data Engineering",

publisher = "IEEE Computer Society",

pages = "2052--2055",

booktitle = "Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019",

address = "United States",

}

Chai, C, Fan, J, Li, G, Wang, J & Zheng, Y 2019, Crowdsourcing database systems: Overview and challenges. in Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019., 8731525, Proceedings - International Conference on Data Engineering, vol. 2019-April, IEEE Computer Society, pp. 2052-2055, 35th IEEE International Conference on Data Engineering, ICDE 2019, Macau, China, 8/04/19. https://doi.org/10.1109/ICDE.2019.00237

Crowdsourcing database systems: Overview and challenges. / Chai, Chengliang; Fan, Ju; Li, Guoliang et al.
Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019. IEEE Computer Society, 2019. p. 2052-2055 8731525 (Proceedings - International Conference on Data Engineering; Vol. 2019-April).

TY - GEN

T1 - Crowdsourcing database systems: Overview and challenges

T2 - 35th IEEE International Conference on Data Engineering, ICDE 2019

AU - Chai, Chengliang

AU - Fan, Ju

AU - Li, Guoliang

AU - Wang, Jiannan

AU - Zheng, Yudian

PY - 2019/4

Y1 - 2019/4

AB - Many data management and analytics tasks, such as entity resolution, cannot be solely addressed by automated processes. Crowdsourcing is an effective way to harness the human cognitive ability to process these computer-hard tasks. Thanks to public crowdsourcing platforms, e.g., Amazon Mechanical Turk and CrowdFlower, we can easily involve hundreds of thousands of ordinary workers (i.e., the crowd) to address these computer-hard tasks. However it is rather inconvenient to interact with the crowdsourcing platforms, because the platforms require one to set parameters and even write codes. Inspired by traditional DBMS, crowdsourcing database systems have been proposed and widely studied to encapsulate the complexities of interacting with the crowd. In this tutorial, we will survey and synthesize a wide spectrum of existing studies on crowdsourcing database systems. We first give an overview of crowdsourcing, and then summarize the fundamental techniques in designing crowdsourcing databases, including task design, truth inference, task assignment, answer reasoning and latency reduction. Next we review the techniques on designing crowdsourced operators, including selection, join, sort, top-k, max/min, count, collect, and fill. Finally, we discuss the emerging challenges.

KW - Crowdsourcing

KW - Database

UR - http://www.scopus.com/inward/record.url?scp=85067919534&partnerID=8YFLogxK

U2 - 10.1109/ICDE.2019.00237

DO - 10.1109/ICDE.2019.00237

M3 - Conference contribution

AN - SCOPUS:85067919534

T3 - Proceedings - International Conference on Data Engineering

SP - 2052

EP - 2055

BT - Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019

PB - IEEE Computer Society

Y2 - 8 April 2019 through 11 April 2019

ER -
