• Publications
PPDB: The Paraphrase Database
We present the 1.0 release of our paraphrase database, PPDB. Its English portion, PPDB:Eng, contains over 220 million paraphrase pairs, consisting of 73 million phrasal and 8 million lexical…
  • 597
  • 104
  • Open Access
PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification
We present a new release of the Paraphrase Database. PPDB 2.0 includes a discriminatively re-ranked set of paraphrases that achieve a higher correlation with human judgments than PPDB 1.0’s heuristic…
  • 217
  • 55
  • Open Access
Annotated Gigaword
We have created layers of annotation on the English Gigaword v.5 corpus to render it useful as a standardized corpus for knowledge extraction and distributional semantics. Most existing large-scale…
  • 220
  • 32
  • Open Access
Hypothesis Only Baselines in Natural Language Inference
We propose a hypothesis only baseline for diagnosing Natural Language Inference (NLI). Especially when an NLI dataset assumes inference is occurring based purely on the relationship between a context…
  • 169
  • 29
  • Open Access
Gender Bias in Coreference Resolution
We present an empirical study of gender bias in coreference resolution systems. We first introduce a novel, Winograd schema-style set of minimal pair sentences that differ only by pronoun gender.
  • 107
  • 21
  • Open Access
What do you learn from context? Probing for sentence structure in contextualized word representations
Contextualized representation models such as ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2018) have recently achieved state-of-the-art results on a diverse array of downstream NLP tasks.
  • 202
  • 18
  • Open Access
Open Domain Targeted Sentiment
We propose a novel approach to sentiment analysis for a low resource setting. The intuition behind this work is that sentiment expressed towards an entity, targeted sentiment, may be viewed as a span…
  • 99
  • 17
  • Open Access
Answer Extraction as Sequence Tagging with Tree Edit Distance
Our goal is to extract answers from preretrieved sentences for Question Answering (QA). We construct a linear-chain Conditional Random Field based on pairs of questions and their possible answer…
  • 181
  • 16
  • Open Access
Efficient spoken term discovery using randomized algorithms
  • A. Jansen, B. V. Durme
  • Computer Science
  • IEEE Workshop on Automatic Speech Recognition…
  • 1 December 2011
Spoken term discovery is the task of automatically identifying words and phrases in speech data by searching for long repeated acoustic patterns. Initial solutions relied on exhaustive dynamic time…
  • 129
  • 14
  • Open Access
ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension
We present a large-scale dataset, ReCoRD, for machine reading comprehension requiring commonsense reasoning. Experiments on this dataset demonstrate that the performance of state-of-the-art MRC…
  • 67
  • 14
  • Open Access