Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task
TLDR
This work defines a new complex, cross-domain semantic parsing and text-to-SQL task in which different complex SQL queries and databases appear in the train and test sets. Experiments with various state-of-the-art models show that Spider presents a strong challenge for future research.
TypeSQL: Knowledge-Based Type-Aware Neural Text-to-SQL Generation
TLDR
This paper presents TypeSQL, a novel approach that formulates the problem as a slot-filling task in a more reasonable way and uses type information to better understand rare entities and numbers in the questions.
SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task
TLDR
Experimental results show that SyntaxSQLNet can handle a significantly greater number of complex SQL examples than prior work, outperforming the previous state-of-the-art model by 9.5% in exact matching accuracy.
Improving Text-to-SQL Evaluation Methodology
TLDR
This work shows that the current division of data into training and test sets measures robustness to variations in the way questions are asked, but only partially tests how well systems generalize to new queries, and proposes a complementary dataset split for evaluating future work.
Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions
TLDR
The interaction history is used by editing the previously predicted query to improve SQL generation quality, and the benefit of editing is evaluated against state-of-the-art baselines that generate SQL from scratch.
Graph-based Neural Multi-Document Summarization
TLDR
This model improves upon other traditional graph-based extractive approaches and the vanilla GRU sequence model with no graph, and it achieves competitive results against other state-of-the-art multi-document summarization systems.
ScisummNet: A Large Annotated Corpus and Content-Impact Models for Scientific Paper Summarization with Citation Networks
TLDR
The first large-scale manually annotated corpus for scientific papers is developed and released by enabling faster annotation. Summarization methods that integrate the authors’ original highlights and the article’s actual impact on the community are proposed to create comprehensive, hybrid summaries.
SParC: Cross-Domain Semantic Parsing in Context
TLDR
An in-depth analysis of SParC shows that it introduces new challenges compared to existing datasets and requires generalization to unseen domains, owing to its cross-domain nature and the unseen databases at test time.
Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering
TLDR
This paper proposes to improve the end-to-end coreference resolution system by using a biaffine attention model to compute antecedent scores for each possible mention, and by jointly optimizing mention detection accuracy and mention clustering accuracy given the mention cluster labels.
CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases
TLDR
This work presents CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. The corpus covers SQL-grounded dialogue state tracking, response generation from query results, and user dialogue act prediction, and a set of strong baselines is evaluated.