Reverse engineering aggregation queries

Wei Chit Tan, Meihui Zhang, Hazem Elmeleegy, Divesh Srivastava

Research output: Contribution to journal > Conference article > peer-review

26 Citations (Scopus)

Abstract

Query reverse engineering seeks to re-generate the SQL query that produced a given query output table from a given database. In this paper, we solve this problem for OLAP queries with group-by and aggregation. We develop a novel three-phase algorithm named REGAL for this problem. First, based on a lattice graph structure, we identify a set of group-by candidates for the desired query. Second, we apply a set of aggregation constraints that are derived from the properties of aggregate operators at both the table-level and the group-level to discover candidate combinations of group-by columns and aggregations that are consistent with the given query output table. Finally, we find a multi-dimensional filter, i.e., a conjunction of selection predicates over the base table attributes, that is needed to generate the exact query output table. We conduct an extensive experimental study over the TPC-H dataset to demonstrate the effectiveness and efficiency of our proposal.
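To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the REGAL algorithm: it uses no lattice pruning and no aggregation constraints, and it skips the filter phase entirely. It simply enumerates every combination of group-by columns and aggregate, runs each candidate, and keeps those that reproduce the output table. The table contents, the column names, the table name `base`, and the helper names `run_query` and `reverse_engineer` are all hypothetical, and the sketch assumes the output table has at least one group-by column followed by exactly one aggregate column.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy base table (illustrative only, not from the paper).
BASE = [
    {"region": "EU", "year": 2016, "qty": 5, "price": 10},
    {"region": "EU", "year": 2017, "qty": 3, "price": 20},
    {"region": "US", "year": 2016, "qty": 4, "price": 30},
    {"region": "US", "year": 2017, "qty": 6, "price": 40},
]

# Output table whose generating query we want to recover.
# Each row is (group-by values..., aggregate value).
TARGET = {("EU", 8), ("US", 10)}

# Aggregates tried by the sketch; AVG is left out for brevity.
AGGS = {"SUM": sum, "MIN": min, "MAX": max, "COUNT": len}


def run_query(rows, group_cols, agg_name, agg_col):
    """Evaluate SELECT group_cols, agg(agg_col) ... GROUP BY group_cols."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in group_cols)
        groups[key].append(row[agg_col])
    return {key + (AGGS[agg_name](vals),) for key, vals in groups.items()}


def reverse_engineer(rows, target):
    """Return every candidate query (as SQL text) that reproduces target."""
    columns = list(rows[0])
    # Only numeric columns are aggregated, so SUM never sees strings.
    numeric = [c for c in columns if isinstance(rows[0][c], (int, float))]
    # Assumption: the last output column is the single aggregate.
    n_group = len(next(iter(target))) - 1
    found = []
    for group_cols in combinations(columns, n_group):
        for agg_col in numeric:
            if agg_col in group_cols:
                continue
            for agg_name in AGGS:
                if run_query(rows, group_cols, agg_name, agg_col) == target:
                    cols = ", ".join(group_cols)
                    found.append(
                        f"SELECT {cols}, {agg_name}({agg_col}) "
                        f"FROM base GROUP BY {cols}"
                    )
    return found


print(reverse_engineer(BASE, TARGET))
```

On this toy data the only surviving candidate is `SELECT region, SUM(qty) FROM base GROUP BY region`. The search space grows exponentially with the number of columns, which is the cost that the lattice-based candidate generation and aggregation constraints described in the abstract are meant to avoid.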

Original language: English
Pages (from-to): 1394-1405
Number of pages: 12
Journal: Proceedings of the VLDB Endowment
Volume: 10
Issue number: 11
DOIs: https://doi.org/10.14778/3137628.3137648
Publication status: Published - 1 Aug 2017
Externally published: Yes
Event: 43rd International Conference on Very Large Data Bases, VLDB 2017 - Munich, Germany
Duration: 28 Aug 2017 to 1 Sept 2017

Cite this

Tan, W. C., Zhang, M., Elmeleegy, H., & Srivastava, D. (2017). Reverse engineering aggregation queries. Proceedings of the VLDB Endowment, 10(11), 1394-1405. https://doi.org/10.14778/3137628.3137648