PPDB: The Paraphrase Database
TLDR
We present the 1.0 release of our paraphrase database, PPDB, which contains over 220 million paraphrase pairs, consisting of 73 million phrasal and 8 million lexical paraphrases.
  • 612
  • 109
PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification
We present a new release of the Paraphrase Database. PPDB 2.0 includes a discriminatively re-ranked set of paraphrases that achieve a higher correlation with human judgments than PPDB 1.0’s heuristic ...
  • 227
  • 57
cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models
TLDR
We present cdec, an open source framework for decoding, aligning with, and training a number of statistical machine translation models, including word-based models, phrase-based models, and models based on synchronous context-free grammars.
  • 246
  • 18
The Multilingual Paraphrase Database
TLDR
We release a massive expansion of the paraphrase database (PPDB) that now includes a collection of paraphrases in 23 different languages.
  • 62
  • 7
Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
TLDR
We extend bilingual paraphrase extraction to syntactic paraphrases and demonstrate its ability to learn a variety of general paraphrastic transformations, including passivization, dative shift, and topicalization.
  • 73
  • 7
Joshua 4.0: Packing, PRO, and Paraphrases
TLDR
We present Joshua 4.0, the newest version of our open-source decoder for parsing-based statistical machine translation.
  • 37
  • 5
Demonstration of Joshua: An Open Source Toolkit for Parsing-based Machine Translation
TLDR
We describe Joshua (Li et al., 2009a), an open source toolkit for statistical machine translation via synchronous context-free grammars that implements chart-parsing, n-gram language model integration, beam- and cube-pruning, and k-best extraction.
  • 18
  • 5
Triplet Lexicon Models for Statistical Machine Translation
TLDR
We present various methods using triplets incorporating long-distance dependencies that can go beyond the local context of phrases or n-gram-based language models.
  • 46
  • 3
Joshua 3.0: Syntax-based Machine Translation with the Thrax Grammar Extractor
TLDR
We present progress on Joshua, an open-source decoder for hierarchical and syntax-based machine translation.
  • 35
  • 2
Domain-Specific Paraphrase Extraction
TLDR
The validity of applying paraphrase rules depends on the domain of the text to which they are applied.
  • 7
  • 2