ConRPG: Paraphrase Generation using Contexts as Regularizer

@inproceedings{conrpg2021,
  title={ConRPG: Paraphrase Generation using Contexts as Regularizer},
  author={Yuxian Meng and Xiang Ao and Qing He and Xiaofei Sun and Qinghong Han and Fei Wu and Chun Fan and Jiwei Li},
  booktitle={Conference on Empirical Methods in Natural Language Processing},
  year={2021}
}
A long-standing issue with paraphrase generation is the lack of reliable supervision signals. In this paper, we propose a new unsupervised paradigm for paraphrase generation based on the assumption that the probabilities of generating two sentences with the same meaning given the same context should be the same. Inspired by this fundamental idea, we propose a pipelined system which consists of paraphrase candidate generation based on contextual language models, candidate filtering using scoring… 
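The core assumption — that two sentences with the same meaning should be roughly equally probable given the same context — can be illustrated with a minimal candidate-filtering sketch. Note that `toy_context_logprob` below is a hypothetical stand-in scorer (length-normalized word overlap), not the contextual language model used in the paper; the threshold `margin` is likewise an illustrative assumption.

```python
import math
from typing import List

def toy_context_logprob(context: str, sentence: str) -> float:
    """Hypothetical stand-in for log p(sentence | context).
    Scores word overlap with the context, length-normalized."""
    ctx = set(context.lower().split())
    words = sentence.lower().split()
    hits = sum(1 for w in words if w in ctx)
    return math.log((hits + 1) / (len(words) + 1))

def filter_candidates(context: str, source: str,
                      candidates: List[str], margin: float = 0.5) -> List[str]:
    """Keep paraphrase candidates whose context-conditioned
    log-probability is within `margin` of the source sentence's."""
    src = toy_context_logprob(context, source)
    return [c for c in candidates
            if abs(toy_context_logprob(context, c) - src) <= margin]

context = "the weather forecast said it would rain all day"
source = "it will rain today"
candidates = ["rain is expected today", "the stock market rose today"]
kept = filter_candidates(context, source, candidates)
```

With a real contextual LM in place of the toy scorer, candidates that fit the context as well as the source sentence survive the filter, while off-topic candidates are discarded.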


Paraphrase Generation as Unsupervised Machine Translation

In this paper, we propose a new paradigm for paraphrase generation by treating the task as unsupervised machine translation (UMT), based on the assumption that there must be pairs of sentences expressing the same meaning in a large, unlabeled monolingual corpus.

Learning to Adapt to Low-Resource Paraphrase Generation

LAPA, an effective adapter for PLMs optimized by meta-learning, enables paraphrase generation models to first learn basic language knowledge, then the paraphrasing task itself, and finally adapt to the target task.

Parameter-efficient feature-based transfer for paraphrase identification

This work proposes a pre-trained task-specific architecture that is competitive with adapter-BERT (a parameter-efficient fine-tuning approach) on some tasks while consuming only 16% of the trainable parameters and saving 69-96% of parameter-update time.

Deep Latent Variable Models for Semi-supervised Paraphrase Generation

A novel unsupervised model named variational sequence auto-encoding reconstruction (VSAR) is presented, which performs latent sequence inference given an observed text, and a supervised model named dual directional learning (DDL) is introduced, which enables semi-supervised learning.

Hierarchical Sketch Induction for Paraphrase Generation

Hierarchical Refinement Quantized Variational Autoencoders (HRQ-VAE), a method for learning decompositions of dense encodings as a sequence of discrete latent variables that make iterative refinements of increasing granularity, is introduced.

Chinese Idiom Paraphrasing

A large-scale CIP dataset of 115,530 sentence pairs is established through human and machine collaboration; three baselines and two novel CIP approaches are deployed to address the task, and the proposed methods outperform the baselines on the established dataset.

A Deep Generative Framework for Paraphrase Generation

Quantitative evaluation of the proposed method on a benchmark paraphrase dataset demonstrates its efficacy and its significant improvement over state-of-the-art methods, while qualitative human evaluation indicates that the generated paraphrases are well-formed, grammatically correct, and relevant to the input sentence.

Paraphrase Generation by Learning How to Edit from Samples

Experimental results show the superiority of the paraphrase generation method in terms of both automatic metrics and human evaluation of the relevance, grammaticality, and diversity of the generated paraphrases.

Joint Learning of a Dual SMT System for Paraphrase Generation

A joint learning method of two SMT systems is proposed to optimize the paraphrase generation process, along with a revised BLEU score (called iBLEU) that measures the adequacy and diversity of the generated paraphrase and is used for tuning parameters in the SMT systems.

Unsupervised Paraphrasing via Deep Reinforcement Learning

Progressive Unsupervised Paraphrasing (PUP) is proposed: a novel unsupervised paraphrase generation method based on deep reinforcement learning (DRL) that outperforms unsupervised state-of-the-art paraphrasing techniques in terms of both automatic metrics and user studies on four real datasets.

Paraphrase Generation with Deep Reinforcement Learning

Experimental results on two datasets demonstrate that the proposed models produce more accurate paraphrases and outperform state-of-the-art paraphrase generation methods in both automatic and human evaluation.

Reformulating Unsupervised Style Transfer as Paraphrase Generation

This paper reformulates unsupervised style transfer as a paraphrase generation problem, and presents a simple methodology based on fine-tuning pretrained language models on automatically generated paraphrase data that significantly outperforms state-of-the-art style transfer systems on both human and automatic evaluations.

Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation

This work introduces a novel model based on the encoder-decoder framework, called Word Embedding Attention Network (WEAN), which generates words by querying distributed word representations (i.e., neural word embeddings), aiming to capture the meaning of the corresponding words.

Unsupervised Paraphrase Generation using Pre-trained Language Models

The experiments show that paraphrases generated with the GPT-2 model are of good quality, are diverse, and improve downstream task performance when used for data augmentation.

ParaBank: Monolingual Bitext Generation and Sentential Paraphrasing via Lexically-constrained Neural Machine Translation

ParaBank is presented, a large-scale English paraphrase dataset that surpasses prior work in both quantity and quality and is used to train a monolingual NMT model with the same support for lexically-constrained decoding for sentence rewriting tasks.

CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling

This paper proposes CGMH, a novel approach using Metropolis-Hastings sampling for constrained sentence generation that allows complicated constraints such as the occurrence of multiple keywords in the target sentences, which cannot be handled in traditional RNN-based approaches.
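The sampling idea behind CGMH can be sketched in miniature: propose local word edits (replace/insert/delete) and accept them with the Metropolis-Hastings ratio, while positions holding constraint keywords are never edited away. The vocabulary, the fluency scorer, and the edit proposals below are toy assumptions for illustration, not CGMH's actual language model or proposal distribution.

```python
import math
import random

VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "quickly", "outside"]
KEYWORDS = {"cat", "sat"}  # hard constraints: must appear in the output

def score(words):
    """Toy stationary log-score: prefer short sentences with keywords."""
    return -0.3 * len(words) + sum(2.0 for w in KEYWORDS if w in words)

def propose(words):
    """Propose a local edit; keyword tokens are never replaced/deleted."""
    words = list(words)
    i = random.randrange(len(words))
    op = random.choice(["replace", "insert", "delete"])
    if op == "replace" and words[i] not in KEYWORDS:
        words[i] = random.choice(VOCAB)
    elif op == "insert":
        words.insert(i, random.choice(VOCAB))
    elif op == "delete" and len(words) > 1 and words[i] not in KEYWORDS:
        del words[i]
    return words

def mh_sample(init, steps=500, seed=0):
    random.seed(seed)
    cur, cur_s = list(init), score(init)
    for _ in range(steps):
        nxt = propose(cur)
        nxt_s = score(nxt)
        # Metropolis-Hastings acceptance (proposal treated as symmetric)
        if random.random() < math.exp(min(0.0, nxt_s - cur_s)):
            cur, cur_s = nxt, nxt_s
    return cur

sentence = mh_sample(["the", "cat", "sat", "outside", "quickly", "quickly"])
```

Because keyword positions are excluded from replace and delete proposals, the constraints hold at every step of the chain, mirroring how CGMH keeps required keywords in the target sentence.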