Interactive cleaning for progressive visualization through composite questions

Yuyu Luo, Chengliang Chai, Xuedi Qin, Nan Tang, Guoliang Li

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

27 Citations (Scopus)

Abstract

In this paper, we study the problem of interactive cleaning for progressive visualization (ICPV): given a bad visualization V, the goal is to obtain a "cleaned" visualization V′ whose distance from V is large, under a given (small) budget w.r.t. human cost. In ICPV, a system interacts with a user iteratively. During each iteration, it asks the user a data cleaning question such as "how to clean detected errors x?", and takes value updates from the user to clean V. Conventional wisdom typically picks a single question (e.g., "Are SIGMOD conference and SIGMOD the same?") with the maximum expected benefit in each iteration. We propose to use a composite question, i.e., a group of single questions treated as one question, in each iteration (for example, "Are SIGMOD conference in t1 and SIGMOD in t2 the same value, and are t1 and t2 duplicates?"). A composite question is presented to the user as a small connected graph through a novel GUI that the user can directly operate on. We propose algorithms to select the best composite question in each iteration. Experiments on real-world datasets verify that, w.r.t. human cost, composite questions are more effective than asking single questions in isolation.
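The iterative loop described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the `Question` model, the benefit and cost numbers, the fixed per-interaction `overhead`, the benefit-per-cost objective, and the brute-force search are all assumptions made for illustration. It only shows the general idea that a connected group of single questions, asked together, can beat the best single question once a shared interaction cost is amortized.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Question:
    """A single cleaning question, modeled (hypothetically) as a graph edge."""
    u: str          # one endpoint, e.g. a tuple id
    v: str          # the other endpoint
    benefit: float  # assumed estimate of the expected gain for the visualization
    cost: float     # assumed human cost of answering this question


def is_connected(group):
    """Return True if the questions, viewed as edges, form one connected graph."""
    adj = {}
    for q in group:
        adj.setdefault(q.u, set()).add(q.v)
        adj.setdefault(q.v, set()).add(q.u)
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for n in adj[stack.pop()]:
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return len(seen) == len(adj)


def best_composite(questions, max_size, budget, overhead):
    """Brute-force the connected group with the best benefit per unit cost.

    Assumption: asking a group costs `overhead` once plus each question's cost.
    """
    best, best_ratio = None, 0.0
    for k in range(1, max_size + 1):
        for group in combinations(questions, k):
            cost = overhead + sum(q.cost for q in group)
            if cost > budget or not is_connected(group):
                continue
            ratio = sum(q.benefit for q in group) / cost
            if ratio > best_ratio:
                best, best_ratio = group, ratio
    return best


def interactive_loop(questions, budget, max_size=2, overhead=1.0):
    """Ask one composite question per iteration until the budget runs out."""
    remaining, asked = list(questions), []
    while remaining:
        group = best_composite(remaining, max_size, budget, overhead)
        if group is None:
            break
        asked.append(group)
        budget -= overhead + sum(q.cost for q in group)
        remaining = [q for q in remaining if q not in group]
    return asked


# Toy data: two questions share tuple t2, a third is unrelated.
q1 = Question("t1", "t2", benefit=3.0, cost=1.0)
q2 = Question("t2", "t3", benefit=2.0, cost=1.0)
q3 = Question("t8", "t9", benefit=1.0, cost=1.0)
plan = interactive_loop([q1, q2, q3], budget=5.0)
```

With this toy data, the first iteration groups `q1` and `q2` because they share `t2` and together yield more benefit per unit cost than `q1` alone; the unrelated `q3` is asked on its own in the next iteration. A real system would replace the brute-force search, which is exponential in `max_size`, with the selection algorithms the paper proposes.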

Original language: English
Title of host publication: Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020
Publisher: IEEE Computer Society
Pages: 733-744
Number of pages: 12
ISBN (Electronic): 9781728129037
Publication status: Published - Apr 2020
Externally published: Yes
Event: 36th IEEE International Conference on Data Engineering, ICDE 2020 - Dallas, United States
Duration: 20 Apr 2020 - 24 Apr 2020

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2020-April
ISSN (Print): 1084-4627

Conference

Conference: 36th IEEE International Conference on Data Engineering, ICDE 2020
Country/Territory: United States
City: Dallas
Period: 20/04/20 - 24/04/20
