TY - GEN
T1 - Interactively Discovering and Ranking Desired Tuples without Writing SQL Queries
AU - Qin, Xuedi
AU - Chai, Chengliang
AU - Luo, Yuyu
AU - Tang, Nan
AU - Li, Guoliang
N1 - Publisher Copyright:
© 2020 Association for Computing Machinery.
PY - 2020/6/14
Y1 - 2020/6/14
N2 - The very first step of many data analytics is to find and (possibly) rank desired tuples, typically through writing SQL queries - this is feasible only for data experts who can write SQL queries and know the data very well. Unfortunately, in practice, the queries might be complicated (for example, "find and rank good off-road cars based on a combination of Price, Make, Model, Age, Mileage, and so on" is complicated because it contains many if-then-else, and, or and not logic) such that even data experts cannot precisely specify SQL queries; and the data might be unknown, which is common in data discovery that one tries to discover desired data from a data lake. Naturally, a system that can help users to discover and rank desired tuples without writing SQL queries is needed. We propose to demonstrate such as a system, namely DExPlorer. To use DExPlorer for data exploration, the user only needs to interactively perform two simple operations over a set of system provided tuples: (1) annotate which tuples are desired (i.e., true labels) or not (i.e., false labels), and (2) annotate whether a tuple is more preferred than another one (i.e., partial orders or ranked lists). We will show that DExPlorer can find user's desired tuples and rank them in a few interactions, even for complicated queries.
AB - The very first step of many data analytics is to find and (possibly) rank desired tuples, typically through writing SQL queries - this is feasible only for data experts who can write SQL queries and know the data very well. Unfortunately, in practice, the queries might be complicated (for example, "find and rank good off-road cars based on a combination of Price, Make, Model, Age, Mileage, and so on" is complicated because it contains many if-then-else, and, or and not logic) such that even data experts cannot precisely specify SQL queries; and the data might be unknown, which is common in data discovery that one tries to discover desired data from a data lake. Naturally, a system that can help users to discover and rank desired tuples without writing SQL queries is needed. We propose to demonstrate such as a system, namely DExPlorer. To use DExPlorer for data exploration, the user only needs to interactively perform two simple operations over a set of system provided tuples: (1) annotate which tuples are desired (i.e., true labels) or not (i.e., false labels), and (2) annotate whether a tuple is more preferred than another one (i.e., partial orders or ranked lists). We will show that DExPlorer can find user's desired tuples and rank them in a few interactions, even for complicated queries.
KW - SQL query
KW - data exploration
KW - database
KW - rank
UR - http://www.scopus.com/inward/record.url?scp=85086237447&partnerID=8YFLogxK
U2 - 10.1145/3318464.3384695
DO - 10.1145/3318464.3384695
M3 - Conference contribution
AN - SCOPUS:85086237447
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 2745
EP - 2748
BT - SIGMOD 2020 - Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
PB - Association for Computing Machinery
T2 - 2020 ACM SIGMOD International Conference on Management of Data, SIGMOD 2020
Y2 - 14 June 2020 through 19 June 2020
ER -