Corpus ID: 216913701

Explicit Representation of the Translation Space: Automatic Paraphrasing for Machine Translation Evaluation

@article{Bawden2020ExplicitRO,
  title={Explicit Representation of the Translation Space: Automatic Paraphrasing for Machine Translation Evaluation},
  author={Rachel Bawden and Biao Zhang and Lisa Yankovskaya and Andre Tattar and Matt Post},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.14989}
}
Following previous work on automatic paraphrasing, we assess the feasibility of improving BLEU (Papineni et al., 2002) using state-of-the-art neural paraphrasing techniques to generate additional references. We explore the extent to which diverse paraphrases can adequately cover the space of valid translations and compare to an alternative approach of generating paraphrases constrained by MT outputs. We compare both approaches to human-produced references in terms of diversity and the… 
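The abstract's core idea is scoring MT output against several references at once, so that any one valid paraphrase can license a match. As a minimal sketch (not the authors' implementation), the snippet below computes a sentence-level multi-reference BLEU in plain Python: clipped n-gram counts are taken against the per-n-gram maximum over all references, and the brevity penalty uses the reference length closest to the hypothesis, as in Papineni et al. (2002).

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def multi_ref_bleu(hypothesis, references, max_n=4):
    """Sentence-level BLEU against multiple references.

    Clipping: each hypothesis n-gram count is capped by the maximum
    count of that n-gram in any single reference, so an n-gram matched
    by *any* reference (e.g. an automatic paraphrase) is credited.
    """
    hyp = hypothesis.split()
    refs = [r.split() for r in references]
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams = ngrams(hyp, n)
        if not hyp_ngrams:
            return 0.0  # hypothesis too short for this n-gram order
        # Per-n-gram maximum count over all references
        max_ref = Counter()
        for r in refs:
            for g, c in ngrams(r, n).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in hyp_ngrams.items())
        if clipped == 0:
            return 0.0  # zero precision at some order => BLEU of 0 (no smoothing)
        log_prec += math.log(clipped / sum(hyp_ngrams.values())) / max_n
    # Brevity penalty uses the reference length closest to the hypothesis
    closest = min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
    bp = 1.0 if len(hyp) >= closest else math.exp(1 - closest / len(hyp))
    return bp * math.exp(log_prec)
```

With this formulation, adding a paraphrased reference can never lower the score; it can only unlock additional n-gram matches, which is why broad coverage of the translation space (rather than paraphrase count alone) is what matters. Production metrics would use a tokenizer and smoothing; sacreBLEU is the usual choice.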
1 Citation

BLEU Might Be Guilty but References Are Not Innocent

TLDR
This paper develops a paraphrasing task for linguists to perform on existing reference translations, which counteracts this bias and reveals that multi-reference BLEU does not improve the correlation for high quality output, and presents an alternative multi-reference formulation that is more effective.

References

Showing 1-10 of 47 references

Filtering Pseudo-References by Paraphrasing for Automatic Evaluation of Machine Translation

TLDR
The experimental results of the WMT 2016 and 2017 datasets show that the proposed method achieved higher correlation with human evaluation than the sentence BLEU (SentBLEU) baselines with a single reference and with unfiltered pseudo-references.

Paraphrasing for Automatic Evaluation

TLDR
It is shown that the use of a paraphrased synthetic reference refines the accuracy of automatic evaluation and there is a strong connection between the quality of automatic paraphrases as judged by humans and their contribution to automatic evaluation.

Paraphrasing Revisited with Neural Machine Translation

TLDR
This paper revisits bilingual pivoting in the context of neural machine translation and presents a paraphrasing model based purely on neural networks, which represents paraphrases in a continuous space, estimates the degree of semantic relatedness between text segments of arbitrary length, and generates candidate paraphrases for any source input.

Automated Paraphrase Lattice Creation for HyTER Machine Translation Evaluation

TLDR
It is shown that although the metric obtains good results on small and carefully curated data with both manually and automatically selected substitutes, it achieves only moderate performance on much larger and noisier datasets, demonstrating the limits of the metric for tuning and evaluation of current MT systems.

Re-evaluating Machine Translation Results with Paraphrase Support

TLDR
ParaEval is presented, an automatic evaluation framework that uses paraphrases to improve the quality of machine translation evaluation and correlates significantly better than BLEU with human assessments of both fluency and adequacy.

Generating Diverse Translations with Sentence Codes

TLDR
This work attempts to obtain diverse translations by using sentence codes to condition sentence generation, and describes two methods to extract the codes, either with or without the help of syntactic information.

The Role of Pseudo References in MT Evaluation

TLDR
This paper further examines the approach of training a metric to evaluate a sentence by comparing it against pseudo-references, i.e. imperfect "references" produced by off-the-shelf MT systems, and shows that the approach generalizes well across multiple years and different source languages.

BLEU Might Be Guilty but References Are Not Innocent

TLDR
This paper develops a paraphrasing task for linguists to perform on existing reference translations, which counteracts this bias and reveals that multi-reference BLEU does not improve the correlation for high quality output, and presents an alternative multi-reference formulation that is more effective.

D-PAGE: Diverse Paraphrase Generation

TLDR
This paper proposes a simple method, Diverse Paraphrase Generation (D-PAGE), which extends neural machine translation models to support the generation of diverse paraphrases with implicit rewriting patterns, and demonstrates that the model generates at least one order of magnitude more diverse outputs than the baselines.

ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations

TLDR
This work uses ParaNMT-50M, a dataset of more than 50 million English-English sentential paraphrase pairs, to train paraphrastic sentence embeddings that outperform all supervised systems on every SemEval semantic textual similarity competition, and additionally shows how the dataset can be used for paraphrase generation.