• Corpus ID: 237194932

Contextualizing Variation in Text Style Transfer Datasets

S. Schoch, Wanyu Du, Yangfeng Ji
Text style transfer involves rewriting the content of a source sentence in a target style. Despite there being a number of style tasks with available data, there has been limited systematic discussion of how text style datasets relate to each other. This understanding, however, is likely to have implications for selecting multiple data sources for model training. While it is prudent to consider inherent stylistic properties when determining these relationships, we also must consider how a style… 


Style Transfer Through Back-Translation

A latent representation of the input sentence is learned, grounded in a language translation model, to better preserve the meaning of the sentence while reducing its stylistic properties; adversarial generation techniques are then used to make the output match the desired style.

Style is NOT a single variable: Case Studies for Cross-Stylistic Language Understanding

This paper provides a benchmark corpus (XSLUE) that combines existing datasets with a newly collected one for sentence-level cross-style language understanding and evaluation, and finds that combining some contradictory styles is likely to generate stylistically less appropriate text.

Style Transfer from Non-Parallel Text by Cross-Alignment

This paper proposes a method that leverages refined alignment of latent representations to perform style transfer on the basis of non-parallel text, and demonstrates the effectiveness of this cross-alignment method on three tasks: sentiment modification, decipherment of word substitution ciphers, and recovery of word order.

Evaluating Style Transfer for Text

This paper specifies three aspects of interest (style transfer intensity, content preservation, and naturalness) and shows how to obtain more reliable measures of them from human evaluation than in previous work, and proposes a set of metrics for automated evaluation that are more strongly correlated and in agreement with human judgment.

Dear Sir or Madam, May I Introduce the GYAFC Dataset: Corpus, Benchmarks and Metrics for Formality Style Transfer

This work creates the largest corpus for a particular stylistic transfer (formality) and shows that techniques from the machine translation community can serve as strong baselines for future work.

Towards Style Transformation from Written-Style to Audio-Style

Drawing on the linguistics and journalism literatures, this work suggests a number of linguistic features that distinguish written style from audio style, and shows the correctness of these features and the impact of style transformation on the user experience.

Paraphrasing for Style

This work shows that even with a relatively small amount of parallel training data, it is possible to learn paraphrase models which capture stylistic phenomena, and these models outperform baselines based on dictionaries and out-of-domain parallel text.

Rethinking Text Attribute Transfer: A Lexical Analysis

A lexical analysis framework, the Pivot Analysis, is proposed to quantitatively analyze the effects of these words in text attribute classification and transfer, and the future requirements and challenges of this task are identified.

StyleNet: Generating Attractive Visual Captions with Styles

StyleNet outperforms existing approaches for generating visual captions with different styles, measured in both automatic and human evaluation metrics on the newly collected FlickrStyle10K image caption dataset, which contains 10K Flickr images with corresponding humorous and romantic captions.

Harnessing Pre-Trained Neural Networks with Rules for Formality Style Transfer

This work studies how to harness rules into a state-of-the-art neural network that is typically pretrained on massive corpora and achieves a new state-of-the-art on benchmark datasets.