Interactively discovering and ranking desired tuples by data exploration

Xuedi Qin, Chengliang Chai*, Yuyu Luo, Tianyu Zhao, Nan Tang, Guoliang Li*, Jianhua Feng, Xiang Yu, Mourad Ouzzani

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

9 Citations (Scopus)

Abstract

Data exploration, the problem of extracting knowledge from a database even when we do not know exactly what we are looking for, is important for data discovery and analysis. However, precisely specifying SQL queries is not always practical, for example for “finding and ranking off-road cars based on a combination of Price, Make, Model, Age, Mileage, etc.” This is not only due to query complexity (e.g., the queries may involve many if-then-else, and, or, and not logic), but also because the user typically does not know all data instances (and their variants). We propose DExPlorer, a system for interactive data exploration. From the user's perspective, we propose a simple and user-friendly interface, which allows the user to: (1) confirm whether a tuple is desired or not, and (2) decide whether one tuple is preferred over another. Behind the scenes, we jointly use multiple ML models to learn from these two types of user feedback. Moreover, to effectively involve the human in the loop, we need to select a set of tuples for each user interaction so as to solicit feedback. Therefore, we devise question selection algorithms, which consider not only the estimated benefit of each tuple, but also the possible partial orders between any two suggested tuples. Experiments on real-world datasets show that DExPlorer outperforms existing approaches in effectiveness.

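Original language: English
Pages (from-to): 753-777
Number of pages: 25
Journal: VLDB Journal
Volume: 31
Issue number: 4
DOI: https://doi.org/10.1007/s00778-021-00714-0
Publication status: Published - Jul 2022
Externally published: Yes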

Cite this

Qin, X., Chai, C., Luo, Y., Zhao, T., Tang, N., Li, G., Feng, J., Yu, X., & Ouzzani, M. (2022). Interactively discovering and ranking desired tuples by data exploration. VLDB Journal, 31(4), 753-777. https://doi.org/10.1007/s00778-021-00714-0