Crowdsourcing-based data extraction from visualization charts

Chengliang Chai, Guoliang Li, Ju Fan, Yuyu Luo

Research output: Chapter in book/report/conference proceeding, Conference contribution, Peer-reviewed

7 Citations (Scopus)

Abstract

Visualization charts are widely used to present structured data. In many cases, people want to explore the data in charts collected from various sources, such as papers and websites, in order to analyze the data further or create new charts. However, existing automatic and semi-automatic approaches are not always effective because of the variety of charts. In this paper, we introduce a crowdsourcing approach that leverages human ability to extract data from visualization charts. There are several challenges. First, it is unclear how to avoid tedious human interaction with charts and design simple crowdsourcing tasks. Second, it is challenging to evaluate worker quality for truth inference, because workers may not only provide inaccurate values but also assign values to the wrong data series. To address these challenges, we design an effective crowdsourcing task scheme that splits a chart into simple micro-tasks. We introduce a novel worker quality model that considers both worker accuracy and task difficulty. We also devise an effective early-stopping mechanism to reduce cost. We have conducted experiments on a real crowdsourcing platform, and the results show that our framework outperforms state-of-the-art approaches on both cost and quality.

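Original language: English
Title of host publication: Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020
Publisher: IEEE Computer Society
Pages: 1814-1817
Number of pages: 4
ISBN (Electronic): 9781728129037
DOI: https://doi.org/10.1109/ICDE48307.2020.00177
Publication status: Published - Apr 2020
Externally published: Yes
Event: 36th IEEE International Conference on Data Engineering, ICDE 2020 - Dallas, United States
Duration: 20 Apr 2020 to 24 Apr 2020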

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2020-April
ISSN (Print): 1084-4627

Conference

Conference: 36th IEEE International Conference on Data Engineering, ICDE 2020
Country/Territory: United States
City: Dallas
Period: 20/04/20 to 24/04/20

Cite this

Chai, C., Li, G., Fan, J., & Luo, Y. (2020). Crowdsourcing-based data extraction from visualization charts. In Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020 (pp. 1814-1817). Article 9101527 (Proceedings - International Conference on Data Engineering; Vol. 2020-April). IEEE Computer Society. https://doi.org/10.1109/ICDE48307.2020.00177