Reverse Engineering Aggregation Queries
@article{Tan2017ReverseEA, title={Reverse Engineering Aggregation Queries}, author={Wei Chit Tan and Meihui Zhang and Hazem Elmeleegy and Divesh Srivastava}, journal={Proc. VLDB Endow.}, year={2017}, volume={10}, pages={1394-1405} }
Query reverse engineering seeks to re-generate the SQL query that produced a given query output table from a given database. In this paper, we solve this problem for OLAP queries with group-by and aggregation. We develop a novel three-phase algorithm named REGAL for this problem. First, based on a lattice graph structure, we identify a set of group-by candidates for the desired query. Second, we apply a set of aggregation constraints that are derived from the properties of aggregate operators…
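The first phase described in the abstract, identifying group-by candidates, can be illustrated with a minimal brute-force sketch. This is not the paper's REGAL algorithm: it only checks one level of the column-subset lattice (subsets as wide as the output key) and keeps those whose distinct value combinations match the output exactly. The function name `group_by_candidates`, the row representation as dictionaries, and the sample data are all assumptions made for illustration.

```python
from itertools import permutations


def group_by_candidates(rows, columns, output_keys):
    """Return ordered column tuples whose distinct value combinations
    in `rows` equal the set of key tuples found in the output table.

    Hypothetical helper for illustration; it is a brute-force check,
    not the lattice pruning strategy used by the paper.
    """
    target = set(output_keys)
    if not target:
        return []
    width = len(next(iter(target)))  # number of key columns in the output
    candidates = []
    # Visit every ordered choice of `width` columns (one lattice level).
    for subset in permutations(columns, width):
        groups = {tuple(row[c] for c in subset) for row in rows}
        if groups == target:
            candidates.append(subset)
    return candidates


# Made-up sample data, used only to exercise the sketch.
rows = [
    {"region": "EU", "product": "A", "year": 2016, "sales": 10},
    {"region": "EU", "product": "B", "year": 2016, "sales": 5},
    {"region": "US", "product": "A", "year": 2017, "sales": 7},
    {"region": "US", "product": "A", "year": 2016, "sales": 3},
]
columns = ["region", "product", "year"]

print(group_by_candidates(rows, columns, [("EU",), ("US",)]))
```

With the sample rows, only `region` produces exactly the groups `EU` and `US`, so it is the single surviving candidate. A practical system would prune the lattice rather than enumerate it, and would go on to test aggregation constraints on the surviving candidates.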
22 Citations
Efficient Query Reverse Engineering Using Table Fragments
- Computer Science
- DASFAA
- 2020
This paper proposes a novel query-centric approach consisting of table partitioning, precomputation, and indexing techniques that significantly outperforms the state-of-the-art solution for PJ\(^+\) queries by an average improvement factor of 120.
REGAL+: Reverse Engineering SPJA Queries
- Computer Science
- Proc. VLDB Endow.
- 2018
This work presents a framework called REGAL, which builds upon and extends prior work to enable the discovery of Select-Project-Join-Aggregation (SPJA) queries over arbitrary schemas, and presents a demonstration of reverse engineering SPJA queries.
Efficient Query Reverse Engineering for Joins and OLAP-Style Aggregations
- Computer Science
- APWeb/WAIM
- 2018
This paper designs an efficient algorithm to discover SQL queries that contain both joins and OLAP-style aggregations, which are important for querying OLAP data warehouses, and shows that the algorithm is adaptable and scalable to large databases.
FastQRE: Fast Query Reverse Engineering
- Computer Science
- SIGMOD Conference
- 2018
This work proposes a novel approach for solving the QRE problem efficiently, which outperforms the existing state of the art by 2-3 orders of magnitude for complex queries, resolving those queries in seconds rather than days, thus making the approach more practical in real-life settings.
Exploring Data through Ranked Entities
- Computer Science
- 2019
This thesis presents COMPETE, a novel approach that models and computes dominance over user-provided input entities, given a database of top-k rankings, using a probabilistic model that estimates the result sizes.
SQUARES: A SQL Synthesizer Using Query Reverse Engineering
- Computer Science
- Proc. VLDB Endow.
- 2020
A novel enumeration-based SQL synthesizer, SQUARES, is proposed. It uses a new line representation in which each program line is represented together with its own subtree, which allows faster enumeration of programs than the usual tree-based encoding.
Reverse engineering database queries from examples: State-of-the-art, challenges, and research opportunities
- Computer Science
- Inf. Syst.
- 2019
PATSQL: Efficient Synthesis of SQL Queries from Example Tables with Quick Inference of Projected Columns
- Computer Science
- Proc. VLDB Endow.
- 2021
This paper proposes an efficient algorithm that synthesizes SQL queries from input/output tables, with strengths in both execution time and the scale of supported tables.
Example-Driven Query Intent Discovery: Abductive Reasoning using Semantic Similarity
- Computer Science
- Proc. VLDB Endow.
- 2019
This work designs an end-to-end system that automatically formulates select-project-join queries in an open-world setting, with optional group-by aggregation and intersection operators, and expresses the problem of query intent discovery using a probabilistic abduction model that infers a query as the most likely explanation of the provided examples.
References
Reverse engineering complex join queries
- Computer Science
- SIGMOD '13
- 2013
An efficient algorithm is proposed that discovers queries with arbitrary join graphs by exploring the set of candidate solutions in a principled way and quickly pruning a large number of infeasible graphs.
Reverse Engineering Top-k Database Queries with PALEO
- Computer Science
- EDBT
- 2016
This work addresses the problem of reverse engineering top-k queries over a database: given a relation R and a sample top-k result list, it aims to determine an SQL query that returns the provided input result when executed over R.
Query From Examples: An Iterative, Data-Driven Approach to Query Construction
- Computer Science
- Proc. VLDB Endow.
- 2015
A new approach, called Query from Examples (QFE), is presented to help non-expert database users construct SQL queries. It seeks to minimize the effort a user needs to determine whether a new database-result pair is consistent with his or her desired query.
Computing similar entity rankings via reverse engineering of top-k database queries
- Computer Science
- 2016 IEEE 32nd International Conference on Data Engineering Workshops (ICDEW)
- 2016
This work addresses the problem of determining queries that compute lists similar to a user-specified input ranking for a ranked list of entities L and a similarity threshold θ, and shows that its system is able to achieve a Recall@10 higher than 80%.
Query by output
- Computer Science
- SIGMOD Conference
- 2009
This paper presents a novel data-driven approach, called Query By Output (QBO), which can enhance the usability of database systems. It designs several optimization techniques to reduce processing overhead and introduces a set of criteria to rank output queries by various notions of utility.
Learning Join Queries from User Examples
- Computer Science
- ACM Trans. Database Syst.
- 2016
The frontier between tractability and intractability is precisely characterized for the following problems of interest in these settings: consistency checking, learnability, and deciding the informativeness of a tuple.
Discovering queries based on example tuples
- Computer Science
- SIGMOD Conference
- 2014
This work studies the problem of discovering the minimal project join query that contains the given example tuples in its output and proposes novel algorithms to solve this problem.
DBXplorer: enabling keyword search over relational databases
- Computer Science, Economics
- SIGMOD '02
- 2002
Just as keyword search and classification hierarchies complement each other for document search, keyword search over databases can be effective.