Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task
This work defines a new complex and cross-domain semantic parsing and text-to-SQL task in which different complex SQL queries and databases appear in the train and test sets. Experiments with various state-of-the-art models show that Spider presents a strong challenge for future research.
TypeSQL: Knowledge-Based Type-Aware Neural Text-to-SQL Generation
This paper presents a novel approach, TypeSQL, which frames the problem as a slot-filling task in a more reasonable way and uses type information to better understand rare entities and numbers in the questions.
SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task
Experimental results show that SyntaxSQLNet can handle a significantly greater number of complex SQL examples than prior work, outperforming the previous state-of-the-art model by 9.5% in exact matching accuracy.
Improving Text-to-SQL Evaluation Methodology
It is shown that the current division of data into training and test sets measures robustness to variations in the way questions are asked, but only partially tests how well systems generalize to new queries. A complementary dataset split is proposed for evaluating future work.
Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions
The interaction history is used by editing the previously predicted query to improve the generation quality of SQL queries, and the benefit of editing is evaluated against state-of-the-art baselines that generate SQL from scratch.
Graph-based Neural Multi-Document Summarization
- Michihiro Yasunaga, Rui Zhang, Kshitijh Meelu, Ayush Pareek, K. Srinivasan, Dragomir R. Radev
- Computer Science, CoNLL
- 20 June 2017
This model improves upon other traditional graph-based extractive approaches and the vanilla GRU sequence model with no graph, and it achieves competitive results against other state-of-the-art multi-document summarization systems.
ScisummNet: A Large Annotated Corpus and Content-Impact Models for Scientific Paper Summarization with Citation Networks
The first large-scale manually annotated corpus for scientific papers is developed and released by enabling faster annotation, and summarization methods that integrate the authors’ original highlights and the article’s actual impacts on the community are proposed to create comprehensive, hybrid summaries.
SParC: Cross-Domain Semantic Parsing in Context
An in-depth analysis of SParC is provided, showing that it introduces new challenges compared to existing datasets and requires generalization to unseen domains due to its cross-domain nature and the unseen databases at test time.
CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases
CoSQL is presented, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems that includes SQL-grounded dialogue state tracking, response generation from query results, and user dialogue act prediction. A set of strong baselines is evaluated.
Dependency Sensitive Convolutional Neural Networks for Modeling Sentences and Documents
DSCNN hierarchically builds textual representations by processing pretrained word embeddings via Long Short-Term Memory networks and subsequently extracting features with convolution operators. It does not rely on parsers and expensive phrase labeling, and thus is not restricted to sentence-level tasks.