• Publications
Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task
TLDR
We present Spider, a large-scale, complex, cross-domain semantic parsing and text-to-SQL task in which different complex SQL queries and databases appear in the train and test sets.
  • 135
  • 55
SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task
TLDR
We propose SyntaxSQLNet, a syntax tree network to address the complex and cross-domain text-to-SQL generation task.
  • 67
  • 37
ScisummNet: A Large Annotated Corpus and Content-Impact Models for Scientific Paper Summarization with Citation Networks
TLDR
We develop and release the first large-scale manually-annotated corpus for scientific papers (on computational linguistics) by enabling faster annotation, and propose summarization methods that integrate the authors’ original highlights (abstract) and the article’s actual impacts in the community (citations), to create comprehensive, hybrid summaries.
  • 47
  • 12
Graph-based Neural Multi-Document Summarization
TLDR
We propose a neural multi-document summarization system that incorporates sentence relation graphs, with sentence embeddings obtained from Recurrent Neural Networks as input node features.
  • 110
  • 11
Robust Multilingual Part-of-Speech Tagging via Adversarial Training
TLDR
Adversarial training (AT) is a powerful regularization method for neural networks, aiming to achieve robustness to input perturbations.
  • 51
  • 8
Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering
TLDR
We propose to improve the end-to-end coreference resolution system by (1) using a biaffine attention model to get antecedent scores for each possible mention, and (2) jointly optimizing the mention detection accuracy and the mention clustering log-likelihood given the mention cluster labels.
  • 27
  • 8
SParC: Cross-Domain Semantic Parsing in Context
TLDR
We present SParC, a dataset for cross-domain Semantic Parsing in Context that consists of 4,298 coherent question sequences (12k+ individual questions annotated with SQL queries).
  • 31
  • 8
CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases
TLDR
We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems.
  • 32
  • 6
Overview and Results: CL-SciSumm Shared Task 2019
TLDR
The CL-SciSumm Shared Task is the first medium-scale shared task on scientific document summarization in the computational linguistics (CL) domain.
  • 21
  • 2
TopicEq: A Joint Topic and Mathematical Equation Model for Scientific Texts
TLDR
We propose a novel topic model that jointly generates mathematical equations and their surrounding text (TopicEq).
  • 8
  • 1