Improving Text-to-SQL Evaluation Methodology

@article{FineganDollak2018ImprovingTE,
  title={Improving Text-to-SQL Evaluation Methodology},
  author={Catherine Finegan-Dollak and Jonathan K. Kummerfeld and Li Zhang and Karthik Ramanathan and Sesh Sadasivam and Rui Zhang and Dragomir R. Radev},
  journal={ArXiv},
  year={2018},
  volume={abs/1806.09029}
}
To be informative, an evaluation must measure how well systems generalize to realistic unseen data. [...] To facilitate evaluation on multiple datasets, we release standardized and improved versions of seven existing datasets and one new text-to-SQL dataset. Second, we show that the current division of data into training and test sets measures robustness to variations in the way questions are asked, but only partially tests how well systems generalize to new queries; therefore, we propose a…


Semantic Evaluation for Text-to-SQL with Distilled Test Suite
We propose test suite accuracy to approximate semantic accuracy for Text-to-SQL models. Our method distills a small test suite of databases that achieves high code coverage for the gold query from a…
An In-Depth Benchmarking of Text-to-SQL Systems
TLDR
A text-to-SQL benchmark is built that covers different classes of queries, and several systems in the field are evaluated for effectiveness and for efficiency (execution time and resource consumption).
DuoRAT: Towards Simpler Text-to-SQL Models
TLDR
This paper begins by building DuoRAT, a re-implementation of the state-of-the-art RAT-SQL model that, unlike RAT-SQL, uses only relation-aware or vanilla transformers as the building blocks, and performs several ablation experiments using DuoRAT as the baseline model.
Zero-shot Text-to-SQL Learning with Auxiliary Task
TLDR
This paper first diagnoses the bottleneck of the text-to-SQL task by providing a new testbed, in which it is observed that existing models generalize poorly to rarely seen data, and designs a simple but effective auxiliary task which serves as a supportive model as well as a regularization term for the generation task to improve the model's generalization.
Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data
TLDR
This work releases SEDE, a dataset with 12,023 pairs of utterances and SQL queries collected from real usage on the Stack Exchange website, and shows that these pairs contain a variety of real-world challenges which were rarely reflected so far in any other semantic parsing dataset.
ColloQL: Robust Text-to-SQL Over Search Queries
TLDR
This work introduces data augmentation techniques and a sampling-based content-aware BERT model (ColloQL) to achieve robust text-to-SQL modeling over natural language search (NLS) questions, and reveals that ColloQL’s superior performance extends to well-formed text.
ChiTeSQL: A Large-Scale and Pragmatic Chinese Text-to-SQL Dataset
TLDR
This paper presents DuSQL, a large-scale and pragmatic Chinese dataset for the cross-domain text-to-SQL task, containing 200 databases, 813 tables, and 23,797 question/SQL pairs, and adopts an effective data construction framework via human-computer collaboration.
DuSQL: A Large-Scale and Pragmatic Chinese Text-to-SQL Dataset
TLDR
This paper presents DuSQL, a large-scale and pragmatic Chinese dataset for the cross-domain text-to-SQL task, containing 200 databases, 813 tables, and 23,797 question/SQL pairs, and adopts an effective data construction framework via human-computer collaboration.
IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles
TLDR
A sequence-to-action parsing approach for the natural language to SQL task that incrementally fills the slots of a SQL query with feasible actions from a pre-defined inventory and sets a new state-of-the-art performance at an execution accuracy of 87.1%.
Natural language to SQL: Where are we today?
TLDR
A comprehensive survey of recent NL2SQL methods is provided, introducing a taxonomy of them and a practical tool for validation by using existing, mature database technologies such as query rewrite and database testing.

References

SHOWING 1-10 OF 34 REFERENCES
Translating Questions to SQL Queries with Generative Parsers Discriminatively Reranked
TLDR
A model for automatically translating a factoid question in natural language to an SQL query that retrieves the correct answer from a target relational database (DB) is defined, which is in line with the best models using external and expensive hand-crafted resources such as the question meaning interpretation.
Towards a theory of natural language interfaces to databases
TLDR
This paper proves that, for a broad class of semantically tractable natural language questions, Precise is guaranteed to map each question to the corresponding SQL query, and shows that Precise compares favorably with Mooney's learning NLI and with Microsoft's English Query product.
Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning
A significant amount of the world’s knowledge is stored in relational databases. However, the ability for users to retrieve facts from a database is limited due to a lack of understanding of query…
Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning
TLDR
This work proposes Seq2SQL, a deep neural network for translating natural language questions to corresponding SQL queries, and releases WikiSQL, a dataset of 80654 hand-annotated examples of questions and SQL queries distributed across 24241 tables from Wikipedia that is an order of magnitude larger than comparable datasets.
Type- and Content-Driven Synthesis of SQL Queries from Natural Language
TLDR
This paper presents a new technique for automatically synthesizing SQL queries from natural language by combining natural language processing, program synthesis, and automated program repair that works for any database without requiring additional customization and does not require users to know the underlying database schema.
SQLizer: query synthesis from natural language
This paper presents a new technique for automatically synthesizing SQL queries from natural language (NL). At the core of our technique is a new NL-based program synthesis methodology that combines…
Understanding Task Design Trade-offs in Crowdsourced Paraphrase Collection
TLDR
This paper presents the first systematic study of the key factors in crowdsourcing paraphrase collection, considering variations in instructions, incentives, data domains, and workflows, and manually analyzes paraphrases for correctness, grammaticality, and linguistic diversity.
Natural Language Interfaces to Databases: An Analysis of the State of the Art
TLDR
This chapter presents an analysis of NLIDBs, which includes their classification, techniques, advantages, disadvantages, and a proposal for a proper evaluation of them.
Constructing an Interactive Natural Language Interface for Relational Databases
TLDR
The architecture of an interactive natural language query interface for relational databases is described; the interface is able to correctly interpret complex natural language queries in a generic manner across a range of domains, and is good enough to be usable in practice.
Learning a Neural Semantic Parser from User Feedback
We present an approach to rapidly and easily build natural language interfaces to databases for new domains, whose performance improves over time based on user feedback, and requires minimal…