CrowdChart: Crowdsourced Data Extraction from Visualization Charts

Chengliang Chai, Guoliang Li*, Ju Fan, Yuyu Luo

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

7 Citations (Scopus)

Abstract

Visualization charts are widely used to present structured data. In many cases, people want to digitize the data in charts collected from various sources (e.g., papers and websites) in order to analyze the data further or create new charts. However, existing automatic and semi-automatic approaches are not always effective because charts vary so widely. In this paper, we introduce a crowdsourcing approach that leverages human ability to extract data from visualization charts. There are several challenges. First, it is unclear how to avoid tedious human interaction with charts and design effective crowdsourcing tasks. Second, it is challenging to evaluate workers' quality for truth inference, because workers may not only provide inaccurate values but also misalign values to the wrong data series. Third, to guarantee quality, one may assign a task to many workers, which leads to a high crowdsourcing cost. To address these challenges, we design an effective crowdsourcing task scheme that splits a chart into simple micro-tasks. We introduce a novel worker quality model that considers both worker accuracy and task difficulty. We also devise effective task assignment and early-termination mechanisms to reduce cost. We evaluate our approach on real-world datasets on real crowdsourcing platforms, and the results demonstrate the effectiveness of our method.
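The abstract does not give the worker quality model or the termination rule, so the following is only a minimal sketch of the general idea under stated assumptions, not the method from the paper. It assumes each micro-task asks for one numeric value, that every worker has a known accuracy weight in (0, 1], and that collection stops once enough of that weight agrees with the current estimate. The function names, the weighted-median aggregation, and all thresholds are hypothetical, and misalignment to the wrong data series is not modeled.

```python
def weighted_median(answers, accuracy):
    """answers: list of (worker_id, value); accuracy: dict worker_id -> weight in (0, 1]."""
    ranked = sorted(answers, key=lambda a: a[1])
    total = sum(accuracy[w] for w, _ in ranked)
    running = 0.0
    for worker, value in ranked:
        running += accuracy[worker]
        if running >= total / 2:
            return value
    raise ValueError("no answers")


def agreement(answers, accuracy, estimate, tolerance=0.05):
    """Share of accuracy weight whose answers lie within a relative tolerance of the estimate."""
    total = sum(accuracy[w] for w, _ in answers)
    close = sum(
        accuracy[w]
        for w, v in answers
        if abs(v - estimate) <= tolerance * max(abs(estimate), 1e-9)
    )
    return close / total


def collect(answer_stream, accuracy, min_answers=3, max_answers=7, threshold=0.8):
    """Consume answers one at a time and stop early once agreement is high enough."""
    answers = []
    estimate = None
    for worker, value in answer_stream:
        answers.append((worker, value))
        estimate = weighted_median(answers, accuracy)
        enough = len(answers) >= min_answers
        if enough and agreement(answers, accuracy, estimate) >= threshold:
            break  # early termination: remaining workers are never asked
        if len(answers) >= max_answers:
            break  # budget cap reached without sufficient agreement
    return estimate, len(answers)


if __name__ == "__main__":
    # Hypothetical worker weights and answers for one bar in a bar chart.
    accuracy = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.9, "e": 0.7}
    stream = [("a", 10.0), ("b", 10.2), ("c", 25.0), ("d", 10.1), ("e", 10.0)]
    print(collect(stream, accuracy))
```

In this sketch a low-weight outlier barely moves the estimate, and the last two workers in the stream are never consulted, which is where the cost saving would come from.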

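Original language: English
Pages (from-to): 3537-3549
Number of pages: 13
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 33
Issue number: 11
DOI: https://doi.org/10.1109/TKDE.2020.2972543
Publication status: Published - 1 Nov 2021
Externally published: Yes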

Cite this

Chai, C., Li, G., Fan, J., & Luo, Y. (2021). CrowdChart: Crowdsourced Data Extraction from Visualization Charts. IEEE Transactions on Knowledge and Data Engineering, 33(11), 3537-3549. https://doi.org/10.1109/TKDE.2020.2972543