Abstract
Query reverse engineering seeks to re-generate the SQL query that produced a given query output table from a given database. In this paper, we solve this problem for OLAP queries with group-by and aggregation. We develop a novel three-phase algorithm named REGAL 1 for this problem. First, based on a lattice graph structure, we identify a set of group-by candidates for the desired query. Second, we apply a set of aggregation constraints that are derived from the properties of aggregate operators at both the table-level and the group-level to discover candidate combinations of group-by columns and aggregations that are consistent with the given query output table. Finally, we find a multi-dimensional filter, i.e., a conjunction of selection predicates over the base table attributes, that is needed to generate the exact query output table. We conduct an extensive experimental study over the TPC-H dataset to demonstrate the effectiveness and efficiency of our proposal.
Original language | English |
---|---|
Pages (from-to) | 1394-1405 |
Number of pages | 12 |
Journal | Proceedings of the VLDB Endowment |
Volume | 10 |
Issue number | 11 |
DOIs | |
Publication status | Published - 1 Aug 2017 |
Externally published | Yes |
Event | 43rd International Conference on Very Large Data Bases, VLDB 2017 - Munich, Germany Duration: 28 Aug 2017 → 1 Sept 2017 |