Learning to Selectively Learn for Weakly-supervised Paraphrase Generation

Kaize Ding, Dingcheng Li, Alexander Hanbo Li, Xing Fan, Chenlei Guo, Yang Liu, Huan Liu
Paraphrase generation is a longstanding NLP task with diverse applications in downstream NLP tasks. However, the effectiveness of existing efforts predominantly relies on large amounts of gold-labeled data. Though unsupervised endeavors have been proposed to alleviate this issue, they may fail to generate meaningful paraphrases due to the lack of supervision signals. In this work, we go beyond the existing paradigms and propose a novel approach to generate high-quality paraphrases with…


Learning to Selectively Learn for Weakly Supervised Paraphrase Generation with Model-based Reinforcement Learning

A new weakly supervised paraphrase generation approach is proposed that extends a recent work leveraging reinforcement learning for effective model training with data selection; with a model-based planning feature and a reward normalization feature, it can partially overcome the issues discussed.

Teacher Forcing Recovers Reward Functions for Text Generation

This work proposes a task-agnostic approach that derives a step-wise reward function directly from a model trained with teacher forcing, along with a simple modification that stabilizes RL training on non-parallel datasets with the induced reward function.

Learning to Augment for Casual User Recommendation

A model-agnostic framework, L2Aug, is proposed to improve recommendations for casual users through data augmentation without sacrificing the core user experience; it outperforms other treatment methods and achieves the best sequential recommendation performance for both casual and core users.

Unsupervised Paraphrasing via Deep Reinforcement Learning

Progressive Unsupervised Paraphrasing (PUP) is proposed: a novel unsupervised paraphrase generation method based on deep reinforcement learning (DRL) that outperforms unsupervised state-of-the-art paraphrasing techniques in terms of both automatic metrics and user studies on four real datasets.

Paraphrase Generation with Deep Reinforcement Learning

Experimental results on two datasets demonstrate the proposed models can produce more accurate paraphrases and outperform the state-of-the-art methods in paraphrase generation in both automatic evaluation and human evaluation.

Automatically Paraphrasing via Sentence Reconstruction and Round-trip Translation

A novel framework for paraphrase generation is proposed that simultaneously decodes the output sentence using a pretrained wordset-to-sequence model and a round-trip translation model; it is also used to augment the training data for machine translation, yielding substantial improvements.

Generating Syntactically Controlled Paraphrases without Using Annotated Parallel Pairs

This paper proposes the Syntactically controlled Paraphrase Generator (SynPG), an encoder-decoder based model that learns to disentangle the semantics and the syntax of a sentence from a collection of unannotated texts; SynPG achieves better syntactic control than unsupervised baselines while generating paraphrases of competitive quality.

Neural Text Simplification in Low-Resource Conditions Using Weak Supervision

This paper exploits large amounts of heterogeneous data to automatically select simple sentences, which are then used to create synthetic simplification pairs, and evaluates complementary solutions such as oversampling and feeding external word embeddings to the neural simplification system.

Paraphrase Generation by Learning How to Edit from Samples

Experimental results show the superiority of the paraphrase generation method in terms of both automatic metrics and human evaluation of the relevance, grammaticality, and diversity of generated paraphrases.

Neural Paraphrase Generation with Stacked Residual LSTM Networks

This work is the first to explore deep learning models for paraphrase generation with a stacked residual LSTM network, adding residual connections between LSTM layers for efficient training of deep LSTMs.

Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext

We consider the problem of learning general-purpose, paraphrastic sentence embeddings in the setting of Wieting et al. (2016b), using neural machine translation to generate sentential paraphrases.

Joint Learning of a Dual SMT System for Paraphrase Generation

A joint learning method of two SMT systems is proposed to optimize the process of paraphrase generation, together with a revised BLEU score (called iBLEU) that measures the adequacy and diversity of the generated paraphrase sentence and is used for tuning parameters in the SMT systems.

A Continuously Growing Dataset of Sentential Paraphrases

A new method to collect large-scale sentential paraphrases from Twitter by linking tweets through shared URLs is presented, yielding the largest human-labeled paraphrase corpus to date (51,524 sentence pairs) and the first cross-domain benchmark for automatic paraphrase identification.