Publications
Towards Universal Paraphrastic Sentence Embeddings
We consider the problem of learning general-purpose, paraphrastic sentence embeddings based on supervision from the Paraphrase Database (Ganitkevitch et al., 2013).
From Paraphrase Database to Compositional Paraphrase Model and Back
The Paraphrase Database (PPDB; Ganitkevitch et al., 2013) is an extensive semantic resource, consisting of a list of phrase pairs with (heuristic) confidence estimates.
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks
We propose syntactically controlled paraphrase networks (SCPNs) and use them to generate adversarial examples that improve the robustness of pretrained models to syntactic variation when used to augment their training data.
Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations
We extend the work of Wieting et al. (2017), back-translating a large parallel corpus to produce more than 51 million English-English sentential paraphrase pairs, a dataset we call ParaNMT-50M.
Charagram: Embedding Words and Sentences via Character n-grams
We present Charagram embeddings, a simple approach for learning character-based compositional models to embed textual sequences, achieving new state-of-the-art performance on several similarity tasks.
Revisiting Recurrent Networks for Paraphrastic Sentence Embeddings
We consider the problem of learning general-purpose, paraphrastic sentence embeddings, revisiting the setting of Wieting et al. (2016b), in which models are trained on noisy paraphrase pairs and evaluated on both STS and supervised semantic tasks.
Simple and Effective Paraphrastic Similarity from Parallel Translations
We present a model and methodology for learning paraphrastic sentence embeddings directly from bitext, removing the time-consuming intermediate step of creating paraphrase corpora.
Beyond BLEU: Training Neural Machine Translation with Semantic Similarity
In this paper, we introduce an alternative reward function for optimizing NMT systems that is based on recent work in semantic similarity.
Improving Candidate Generation for Low-resource Cross-lingual Entity Linking
In this paper, we first assess the problems faced by current entity candidate generation methods for low-resource XEL, then propose three improvements that (1) reduce the disconnect between entity mentions and KB entries, and (2) improve the robustness of the model to low-resource scenarios.
No Training Required: Exploring Random Encoders for Sentence Classification
We explore various methods for computing sentence representations from pre-trained word embeddings without any training, i.e., using nothing but random parameterizations.