Incentive-based entity collection using crowdsourcing

Chengliang Chai, Ju Fan*, Guoliang Li

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

25 Citations (Scopus)

Abstract

Crowdsourced entity collection leverages human ability to collect entities that are missing from a database, and it has many real-world applications, such as knowledge base enrichment and enterprise data collection. There are several challenges. First, it is hard to evaluate a worker's quality because it depends not only on the correctness of the entities she provides but also on the distinctness of these entities compared with those collected by other workers. Second, crowd workers are likely to provide popular entities, so different workers will provide many duplicate entities, leading to wasted money and low coverage. To address these challenges, we propose an incentive-based crowdsourced entity collection framework, CrowdEC, that encourages workers to provide more distinct items using an incentive strategy. CrowdEC has fundamental differences from existing crowdsourced collection methods. On the one hand, CrowdEC proposes a worker model and evaluates a worker's quality based on cross validation and entity checking. CrowdEC devises a worker utility model that considers both a worker's quality and the distinctness of the entities she provides. CrowdEC proposes a worker elimination method to block workers with low utility, which solves the first challenge. On the other hand, CrowdEC proposes an incentive pricing technique that encourages each qualified (i.e., non-eliminated) worker to provide distinct entities rather than duplicates. CrowdEC provides two types of tasks and judiciously assigns appropriate tasks to workers to address the second challenge. We have conducted both real and simulated experiments, and the results show that CrowdEC outperforms existing state-of-the-art methods on both cost and quality.

Original language: English
Title of host publication: Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 341-352
Number of pages: 12
ISBN (Electronic): 9781538655207
Publication status: Published - 24 Oct 2018
Externally published: Yes
Event: 34th IEEE International Conference on Data Engineering, ICDE 2018 - Paris, France
Duration: 16 Apr 2018 - 19 Apr 2018

Publication series

Name: Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018

Conference

Conference: 34th IEEE International Conference on Data Engineering, ICDE 2018
Country/Territory: France
City: Paris
Period: 16/04/18 - 19/04/18

Keywords

  • Crowdsourcing
  • Entity collection
