CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases

@inproceedings{Yu2019CoSQLAC,
  title={CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases},
  author={Tao Yu and Rui Zhang and He Yang Er and Suyi Li and Eric Xue and Bo Pang and Xi Victoria Lin and Yi Chern Tan and Tianze Shi and Zihan Li and Youxuan Jiang and Michihiro Yasunaga and Sungrok Shim and Tao Chen and Alexander R. Fabbri and Zifan Li and Luyao Chen and Yuwen Zhang and Shreya Dixit and Vincent Zhang and Caiming Xiong and Richard Socher and Walter S. Lasecki and Dragomir R. Radev},
  booktitle={EMNLP},
  year={2019}
}
We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. [...] We evaluate a set of strong baselines for each task and show that CoSQL presents significant challenges for future research. The dataset, baselines, and leaderboard will be released at this https URL.

Paper Mentions

Service-oriented Text-to-SQL Parsing
TLDR
A service-oriented Text-to-SQL parser that translates natural language utterances into structured, executable SQL queries, and an algorithmic framework named Semantic-Enriched SQL generator (SE-SQL) that enables more flexible database access than a rigid API in the application while keeping the performance quality for the most commonly used cases.
Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions
TLDR
The interaction history is utilized by editing the previously predicted query to improve the generation quality of SQL queries, and the benefit of editing is evaluated against state-of-the-art baselines that generate SQL from scratch.
Efficient Deployment of Conversational Natural Language Interfaces over Databases
TLDR
This work proposes a novel method for accelerating training dataset collection for natural language-to-query-language machine learning models, and allows one to generate conversational multi-turn data, where multiple turns define a dialogue session, enabling one to better utilize chatbot interfaces.
Decoupled Dialogue Modeling and Semantic Parsing for Multi-Turn Text-to-SQL
TLDR
This paper proposes a novel decoupled multi-turn Text-to-SQL framework, where an utterance rewrite model first explicitly solves completion of dialogue context, and then a single-turn Text-to-SQL parser follows to address the data sparsity problem.
Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering
TLDR
This paper proposes a hybrid framework that takes both textual and tabular evidence as input and generates either direct answers or SQL queries depending on which form could better answer the question, and achieves state-of-the-art performance on the OpenSQuAD dataset using a T5-base model.
AirConcierge: Generating Task-Oriented Dialogue via Efficient Large-Scale Knowledge Retrieval
TLDR
This work proposes an end-to-end trainable text-to-SQL guided framework to learn a neural agent that interacts with KBs using the generated SQL queries, and evaluates the proposed method on the AirDialogue dataset, a large corpus released by Google containing the conversations of customers booking flight tickets from the agent.
Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing
TLDR
This work presents BRIDGE, a powerful sequential architecture for modeling dependencies between natural language questions and relational databases in cross-DB semantic parsing that effectively captures the desired cross-modal dependencies and has the potential to generalize to more text-DB related tasks.
Tracking Interaction States for Multi-Turn Text-to-SQL Semantic Parsing
TLDR
Two kinds of interaction states are defined based on schema items and SQL keywords separately, and a relational graph neural network and a non-linear layer are designed to update the representations of these two states respectively.
DIY: Assessing the Correctness of Natural Language to SQL Systems
TLDR
The paper presents DIY, an interactive technique that enables users to assess the responses from a state-of-the-art natural language to SQL system for correctness and, if possible, fix errors, and describes how DIY helps users assess the correctness of the system’s answers and detect and fix errors.
PG-GSQL: Pointer-Generator Network with Guide Decoding for Cross-Domain Context-Dependent Text-to-SQL Generation
TLDR
An encoder-decoder model called PG-GSQL is presented, based on an interaction-level encoder and with two effective innovations in the decoder, to solve the cross-domain context-dependent text-to-SQL generation task.

References

SHOWING 1-10 OF 63 REFERENCES
Towards Complex Text-to-SQL in Cross-Domain Database with Intermediate Representation
TLDR
The proposed IRNet aims to address two challenges: the mismatch between intents expressed in natural language (NL) and the implementation details in SQL, and the challenge in predicting columns caused by the large number of out-of-domain words.
Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions
TLDR
The interaction history is utilized by editing the previously predicted query to improve the generation quality of SQL queries, and the benefit of editing is evaluated against state-of-the-art baselines that generate SQL from scratch.
SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task
TLDR
Experimental results show that SyntaxSQLNet can handle a significantly greater number of complex SQL examples than prior work, outperforming the previous state-of-the-art model by 9.5% in exact matching accuracy.
SParC: Cross-Domain Semantic Parsing in Context
TLDR
An in-depth analysis of SParC is provided and it is shown that it introduces new challenges compared to existing datasets and requires generalization to unseen domains due to its cross-domain nature and the unseen databases at test time.
Towards a theory of natural language interfaces to databases
TLDR
This paper proves that, for a broad class of semantically tractable natural language questions, Precise is guaranteed to map each question to the corresponding SQL query, and shows that Precise compares favorably with Mooney's learning NLI and with Microsoft's English Query product.
Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning
TLDR
This work proposes Seq2SQL, a deep neural network for translating natural language questions to corresponding SQL queries, and releases WikiSQL, a dataset of 80654 hand-annotated examples of questions and SQL queries distributed across 24241 tables from Wikipedia that is an order of magnitude larger than comparable datasets.
Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task
TLDR
This work defines a new complex and cross-domain semantic parsing and text-to-SQL task so that different complicated SQL queries and databases appear in train and test sets, and experiments with various state-of-the-art models show that Spider presents a strong challenge for future research.
SQLizer: query synthesis from natural language
This paper presents a new technique for automatically synthesizing SQL queries from natural language (NL). At the core of our technique is a new NL-based program synthesis methodology that combines [...]
Constructing an Interactive Natural Language Interface for Relational Databases
TLDR
The architecture of an interactive natural language query interface for relational databases is described; the interface is able to correctly interpret complex natural language queries in a generic manner across a range of domains and is good enough to be usable in practice.
IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles
TLDR
A sequence-to-action parsing approach for the natural language to SQL task that incrementally fills the slots of a SQL query with feasible actions from a pre-defined inventory and sets a new state-of-the-art performance at an execution accuracy of 87.1%.