Towards Universal Paraphrastic Sentence Embeddings
This work considers the problem of learning general-purpose, paraphrastic sentence embeddings based on supervision from the Paraphrase Database, and compares six compositional architectures, finding that the most complex architectures, such as long short-term memory (LSTM) recurrent neural networks, perform best on the in-domain data.
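For orientation, the sketch below shows the simplest end of that architectural spectrum: a sentence embedded as the average of its word vectors, trained with a margin-based loss so that paraphrase pairs outscore negative examples. It is a minimal illustration, not the paper's code; the embedding lookup, margin value, and single sampled negative are assumptions.

```python
# Minimal sketch (not the paper's code): average-of-word-vectors sentence
# embeddings with a hinge loss on paraphrase pairs. `embeddings` is assumed
# to be a dict mapping words to vectors; the margin is illustrative.
import numpy as np

def embed(sentence, embeddings, dim=300):
    """Average the word vectors of a sentence (unknown words are skipped)."""
    vecs = [embeddings[w] for w in sentence.split() if w in embeddings]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def margin_loss(s1, s2, negative, embeddings, margin=0.4):
    """Hinge loss: a paraphrase pair should beat a negative pair by `margin`."""
    e1, e2, en = (embed(s, embeddings) for s in (s1, s2, negative))
    return max(0.0, margin - cosine(e1, e2) + cosine(e1, en))
```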
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks
A combination of automated and human evaluations shows that SCPNs generate paraphrases that follow their target specifications without decreasing paraphrase quality when compared to baseline (uncontrolled) paraphrase systems.
From Paraphrase Database to Compositional Paraphrase Model and Back
This work proposes models to leverage the phrase pairs from the Paraphrase Database to build parametric paraphrase models that score paraphrase pairs more accurately than the PPDB’s internal scores while simultaneously improving its coverage.
ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations
This work uses ParaNMT-50M, a dataset of more than 50 million English-English sentential paraphrase pairs, to train paraphrastic sentence embeddings that outperform all supervised systems on every SemEval semantic textual similarity competition, in addition to showing how it can be used for paraphrase generation.
Reformulating Unsupervised Style Transfer as Paraphrase Generation
This paper reformulates unsupervised style transfer as a paraphrase generation problem, and presents a simple methodology based on fine-tuning pretrained language models on automatically generated paraphrase data that significantly outperforms state-of-the-art style transfer systems on both human and automatic evaluations.
Charagram: Embedding Words and Sentences via Character n-grams
It is demonstrated that Charagram embeddings outperform more complex architectures based on character-level recurrent and convolutional neural networks, achieving new state-of-the-art performance on several similarity tasks.
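A minimal sketch of the character n-gram idea follows. Note one simplification: for brevity it hashes n-grams into a fixed number of buckets, whereas the paper keeps an explicit vocabulary of frequent n-grams; the n-gram orders, table size, and nonlinearity here are likewise illustrative.

```python
# Sketch of a Charagram-style embedding: sum learned vectors for a text's
# character n-grams (with boundary markers), then apply a nonlinearity.
# Hashing into buckets is a simplification of the paper's n-gram vocabulary;
# the table would be learned, not random, in practice.
import numpy as np

rng = np.random.default_rng(0)
NUM_BUCKETS, DIM = 50_000, 100
ngram_table = rng.normal(scale=0.1, size=(NUM_BUCKETS, DIM))

def char_ngrams(text, n_values=(2, 3, 4)):
    padded = f"#{text}#"  # boundary markers
    for n in n_values:
        for i in range(len(padded) - n + 1):
            yield padded[i:i + n]

def charagram_embed(text):
    vec = np.zeros(DIM)
    for ng in char_ngrams(text):
        vec += ngram_table[hash(ng) % NUM_BUCKETS]
    return np.tanh(vec)
```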
Revisiting Recurrent Networks for Paraphrastic Sentence Embeddings
This work considers the problem of learning general-purpose, paraphrastic sentence embeddings, revisiting the setting of Wieting et al. (2016b), and presents several developments that together produce the opposite conclusion: that recurrent networks, trained and regularized appropriately, can outperform the simpler word-averaging models favored in that earlier work.
Canine: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
This work presents Canine, a neural encoder that operates directly on character sequences, without explicit tokenization or vocabulary, together with a pre-training strategy that either operates directly on characters or optionally uses subwords as a soft inductive bias.
Beyond BLEU: Training Neural Machine Translation with Semantic Similarity
This work introduces an alternative reward function for optimizing NMT systems, based on recent work in semantic similarity, and finds that training with the proposed metric yields better translations as evaluated by BLEU, semantic similarity, and human judgments, and that the optimization procedure converges faster.
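The training objective can be pictured as a minimum-risk-style loss: sample several candidate translations, score each against the reference with a semantic similarity metric, and minimize the expected cost under the model's renormalized distribution over the sample. The sketch below shows that loss in isolation; the sampling procedure and the `semantic_similarity` scorer are hypothetical stand-ins, and the paper's full setup differs in detail.

```python
# Sketch of a risk-style loss with a semantic-similarity reward. Assumes the
# caller has already sampled k translations from the model and computed their
# log-probabilities and similarity scores against the reference.
import torch

def risk_loss(log_probs, rewards):
    """Expected cost over sampled hypotheses, renormalized within the sample.

    log_probs: 1-D tensor of model log-probabilities for k sampled translations.
    rewards:   1-D tensor of semantic-similarity scores against the reference.
    """
    probs = torch.softmax(log_probs, dim=-1)   # renormalize over the sample
    costs = 1.0 - rewards                      # higher similarity -> lower cost
    return (probs * costs).sum()
```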
Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext
We consider the problem of learning general-purpose, paraphrastic sentence embeddings in the setting of Wieting et al. (2016b). We use neural machine translation to generate sentential paraphrases…
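The generation step can be summarized in a few lines: for each bitext pair, translate the non-English side back into English and pair the result with the original English sentence. The `translate` function and its arguments below are hypothetical stand-ins for a trained NMT system, not an interface from the paper.

```python
# Sketch of paraphrase-pair generation via back-translation: the reference
# English sentence and the NMT back-translation of its foreign counterpart
# (e.g., Czech) form a sentential paraphrase pair for training embeddings.
def paraphrase_pairs(bitext, translate):
    """bitext: iterable of (english, foreign) sentence pairs."""
    for english, foreign in bitext:
        back_translated = translate(foreign, src="cs", tgt="en")  # hypothetical API
        yield english, back_translated  # a paraphrase pair for training
```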