TY - GEN
T1 - Ranking desired tuples by database exploration
AU - Qin, Xuedi
AU - Chai, Chengliang
AU - Luo, Yuyu
AU - Zhao, Tianyu
AU - Tang, Nan
AU - Li, Guoliang
AU - Feng, Jianhua
AU - Yu, Xiang
AU - Ouzzani, Mourad
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/4
Y1 - 2021/4
N2 - Database exploration - the problem of finding and ranking desired tuples - is important for data discovery and analysis. Precisely specifying SQL queries is not always feasible in practice, as in "finding and ranking off-road cars based on a combination of Price, Make, Model, Age, and Mileage", not only because of the query complexity (e.g., the query may involve many if-then-else, and, or, and not conditions), but also because the user typically does not know all the data instances. We propose DExPlorer, a system for interactive database exploration. DExPlorer offers a simple and user-friendly interface that allows the user to: (1) confirm whether a tuple is desired or not, and (2) decide whether one tuple is preferred over another. Behind the scenes, we jointly use multiple ML models to learn from these two types of user feedback. Moreover, to involve users effectively, we carefully select the set of tuples for which we solicit feedback. To this end, we devise question selection algorithms that consider not only the estimated benefit of each tuple, but also the possible partial orders between any two suggested tuples. Experiments on real-world datasets show that DExPlorer is more effective than existing approaches.
AB - Database exploration - the problem of finding and ranking desired tuples - is important for data discovery and analysis. Precisely specifying SQL queries is not always feasible in practice, as in "finding and ranking off-road cars based on a combination of Price, Make, Model, Age, and Mileage", not only because of the query complexity (e.g., the query may involve many if-then-else, and, or, and not conditions), but also because the user typically does not know all the data instances. We propose DExPlorer, a system for interactive database exploration. DExPlorer offers a simple and user-friendly interface that allows the user to: (1) confirm whether a tuple is desired or not, and (2) decide whether one tuple is preferred over another. Behind the scenes, we jointly use multiple ML models to learn from these two types of user feedback. Moreover, to involve users effectively, we carefully select the set of tuples for which we solicit feedback. To this end, we devise question selection algorithms that consider not only the estimated benefit of each tuple, but also the possible partial orders between any two suggested tuples. Experiments on real-world datasets show that DExPlorer is more effective than existing approaches.
KW - Database Exploration
KW - Ranking
KW - SQL Query
UR - http://www.scopus.com/inward/record.url?scp=85112868610&partnerID=8YFLogxK
U2 - 10.1109/ICDE51399.2021.00186
DO - 10.1109/ICDE51399.2021.00186
M3 - Conference contribution
AN - SCOPUS:85112868610
T3 - Proceedings - International Conference on Data Engineering
SP - 1973
EP - 1978
BT - Proceedings - 2021 IEEE 37th International Conference on Data Engineering, ICDE 2021
PB - IEEE Computer Society
T2 - 37th IEEE International Conference on Data Engineering, ICDE 2021
Y2 - 19 April 2021 through 22 April 2021
ER -