Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations

@inproceedings{Wieting2018PushingTL,
  title={Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations},
  author={J. Wieting and Kevin Gimpel},
  booktitle={ACL},
  year={2018}
}
  • J. Wieting, Kevin Gimpel
  • Published in ACL 2018
  • Computer Science, Mathematics
  • We extend the work of Wieting et al. (2017), back-translating a large parallel corpus to produce ParaNMT-50M, a dataset of more than 51 million English-English sentential paraphrase pairs. We find this corpus to cover many domains and styles of text, in addition to being rich in paraphrases with different sentence structure, and we release it to the community. To show its utility, we use it to train paraphrastic sentence embeddings using only minor changes to the…
    136 Citations
    Zero-Shot Paraphrase Generation with Multilingual Language Models
    Neural Syntactic Preordering for Controlled Paraphrase Generation
    Paraphrase Generation and Evaluation on Colloquial-Style Sentences
    Generating Syntactically Controlled Paraphrases without Using Annotated Parallel Pairs
    Unsupervised Evaluation Metrics and Learning Criteria for Non-Parallel Textual Transfer
    A Bilingual Generative Transformer (2019)

    References

    Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext
    Towards Universal Paraphrastic Sentence Embeddings
    A Simple but Tough-to-Beat Baseline for Sentence Embeddings
    From Paraphrase Database to Compositional Paraphrase Model and Back
    Revisiting Recurrent Networks for Paraphrastic Sentence Embeddings
    A Continuously Growing Dataset of Sentential Paraphrases
    Embedding Word Similarity with Neural Machine Translation
    Skip-Thought Vectors
    Paraphrasing Revisited with Neural Machine Translation