TY - GEN
T1 - Incentive-based entity collection using crowdsourcing
AU - Chai, Chengliang
AU - Fan, Ju
AU - Li, Guoliang
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/10/24
Y1 - 2018/10/24
N2 - Crowdsourced entity collection leverages humans' ability to collect entities that are missing from a database, and it has many real-world applications, such as knowledge base enrichment and enterprise data collection. It poses two challenges. First, it is hard to evaluate a worker's quality, because that quality depends not only on the correctness of the entities she provides but also on how distinct those entities are from the ones collected by other workers. Second, crowd workers tend to provide popular entities, so different workers provide many duplicate entities, which wastes money and yields low coverage. To address these challenges, we propose CrowdEC, an incentive-based crowdsourced entity collection framework that uses an incentive strategy to encourage workers to provide more distinct entities. CrowdEC differs fundamentally from existing crowdsourced collection methods. On the one hand, CrowdEC proposes a worker model and evaluates a worker's quality based on cross validation and entity checking. It devises a worker utility model that considers both the worker's quality and the distinctness of the entities the worker provides, and it proposes a worker elimination method that blocks workers with low utility, which addresses the first challenge. On the other hand, CrowdEC proposes an incentive pricing technique that encourages each qualified (i.e., non-eliminated) worker to provide distinct entities rather than duplicates. It provides two types of tasks and judiciously assigns appropriate tasks to workers, which addresses the second challenge. We have conducted both real and simulated experiments, and the results show that CrowdEC outperforms existing state-of-the-art approaches in both cost and quality.
AB - Crowdsourced entity collection leverages humans' ability to collect entities that are missing from a database, and it has many real-world applications, such as knowledge base enrichment and enterprise data collection. It poses two challenges. First, it is hard to evaluate a worker's quality, because that quality depends not only on the correctness of the entities she provides but also on how distinct those entities are from the ones collected by other workers. Second, crowd workers tend to provide popular entities, so different workers provide many duplicate entities, which wastes money and yields low coverage. To address these challenges, we propose CrowdEC, an incentive-based crowdsourced entity collection framework that uses an incentive strategy to encourage workers to provide more distinct entities. CrowdEC differs fundamentally from existing crowdsourced collection methods. On the one hand, CrowdEC proposes a worker model and evaluates a worker's quality based on cross validation and entity checking. It devises a worker utility model that considers both the worker's quality and the distinctness of the entities the worker provides, and it proposes a worker elimination method that blocks workers with low utility, which addresses the first challenge. On the other hand, CrowdEC proposes an incentive pricing technique that encourages each qualified (i.e., non-eliminated) worker to provide distinct entities rather than duplicates. It provides two types of tasks and judiciously assigns appropriate tasks to workers, which addresses the second challenge. We have conducted both real and simulated experiments, and the results show that CrowdEC outperforms existing state-of-the-art approaches in both cost and quality.
KW - Crowdsourcing
KW - Entity collection
UR - http://www.scopus.com/inward/record.url?scp=85057122547&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2018.00039
DO - 10.1109/ICDE.2018.00039
M3 - Conference contribution
AN - SCOPUS:85057122547
T3 - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
SP - 341
EP - 352
BT - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 34th IEEE International Conference on Data Engineering, ICDE 2018
Y2 - 16 April 2018 through 19 April 2018
ER -