Provenance for Natural Language Queries

@article{Deutch2017ProvenanceFN,
  title={Provenance for Natural Language Queries},
  author={Daniel Deutch and Nave Frost and Amir Gilad},
  journal={Proc. VLDB Endow.},
  year={2017},
  volume={10},
  pages={577--588}
}
Multiple lines of research have developed Natural Language (NL) interfaces for formulating database queries. We build upon this work, but focus on presenting a highly detailed form of the answers in NL. The answers that we present are importantly based on the provenance of tuples in the query result, detailing not only the results but also their explanations. We develop a novel method for transforming provenance information to NL, by leveraging the original NL query structure. Furthermore… 
Explaining Natural Language query results
TLDR
This work develops a novel method for transforming provenance information to NL, by leveraging the original NL query structure, and presents two solutions for its effective presentation as NL text: one based on provenance factorization, with novel desiderata relevant to the NL case and one that is based on summarization.
Explaining Queries Over Web Tables to Non-experts
TLDR
This work augments a state-of-the-art NL interface over web tables, enhancing it in both its training and deployment phases, and introduces novel query explanations that provide a graphic representation of the query cell-based provenance in its execution on a given table.
Reverse-Engineering Conjunctive Queries from Provenance Examples
TLDR
The theoretical analysis shows that there may be many consistent queries (for some models, even infinitely many in the presence of self-joins), yet the algorithms provided are practically efficient at finding such (best-fit) queries.
Provenance for Non-Experts
TLDR
This paper outlines ongoing research and preliminary results, addressing the challenges of developing provenance solutions that explain computation results to non-expert users.
Explaining Missing Query Results in Natural Language
TLDR
This paper proposes a novel approach to “marry” NLIDBs with an existing model for explaining missing query results by pinpointing the last query operator that is “responsible” for the missing result.
Provenance Summaries for Answers and Non-Answers
TLDR
PUG limits provenance capture to what is relevant to explain a (missing) result of interest and uses an efficient sampling-based summarization method to produce compact explanations for (missing) answers.
From Natural Language Questions to SPARQL Queries: A Pattern-based Approach
TLDR
The main contribution of the proposed approach is that the underlying knowledge base can be replaced easily: the approach is based on general question and query patterns and only accesses the knowledge base for the actual query generation and execution.
Putting Things into Context: Rich Explanations for Query Answers using Join Graphs
TLDR
This work proposes a new approach for explaining query results that augments provenance with information from other related tables in the database, using a suite of optimization techniques.
ML Based Lineage in Databases
TLDR
This work proposes a novel approach for approximating lineage tracking using a Machine Learning (ML) and Natural Language Processing (NLP) technique, namely word embedding, and designs an alternative lineage tracking mechanism that keeps track of and queries lineage at the column (“gene”) level to better distinguish between the provenance features and the textual characteristics of a tuple.
Fragment-Driven Natural Language Interaction with Databases
TLDR
This work proposes an alternative fragment-driven interaction model, where the system provides an explanation as to how the natural language produced the resulting SQL, which enables the user to interact with the system purely in natural language and to make incremental modifications to their resulting database query without having to learn any SQL.

References

SHOWING 1-10 OF 49 REFERENCES
NLProv: Natural Language Provenance
TLDR
This work develops a novel method for transforming provenance information to NL, by leveraging the original NL question structure, and presents two solutions for its effective presentation as NL text: one based on provenance factorization with novel desiderata relevant to the NL case, and one that is based on summarization.
Selective Provenance for Datalog Programs Using Top-K Queries
TLDR
A novel top-k query language for querying datalog provenance, supporting selection criteria based on tree patterns and ranking based on the rules and database facts used in derivation, and an efficient novel algorithm based on instrumenting the datalog program so that it generates only relevant provenance.
Querying data provenance
TLDR
A query language for provenance is developed, which can express all of the aforementioned types of queries, as well as many more, and the feasibility of provenance querying and the benefits of the indexing techniques across a variety of application classes and queries are experimentally validated.
TR Discover: A Natural Language Interface for Querying and Analyzing Interlinked Datasets
TLDR
The TR Discover system, a natural language-based system that allows non-technical users to create well-formed questions, is developed for future use with Thomson Reuters Cortellis and is shown to be usable and portable; the relative performance of queries using SQL and SPARQL back ends is also reported.
Explaining structured queries in natural language
TLDR
This paper represents various forms of structured queries as directed graphs, annotates the graph edges with template labels using an extensible template mechanism, and presents different graph traversal strategies for efficiently exploring these graphs and composing textual query descriptions.
Using SQL for Efficient Generation and Querying of Provenance Information
TLDR
This chapter reviews some of the main contributions of Perm, a DBMS that generates different types of provenance information for complex SQL queries (including nested and correlated subqueries and aggregation).
Quelo Natural Language Interface: Generating queries and answer descriptions
TLDR
This work describes Quelo NLI functionality and presents a grammar-based natural language generation approach that better supports the domain-independent generation of fluent queries and naturally extends to the generation of answer descriptions.
Approximated Summarization of Data Provenance
TLDR
The notion of approximated summarized provenance is introduced, which provides a compact representation of the provenance at the possible cost of information loss, and a novel provenance summarization algorithm is presented which outputs a summary of the input provenance.
Provenance: On and Behind the Screens
TLDR
The second part of this tutorial focuses on enabling users to leverage provenance through adapted visualizations, and will present some fundamental concepts of visualization before discussing possible visualizations for provenance.
A Natural Language Interface for Querying General and Individual Knowledge
TLDR
A modular translation framework that employs new solutions along with state-of-the-art NL parsing tools is designed and implemented; it provides a high-quality translation for many questions that are not handled by previous translation tools.