Corpus ID: 211987786

Data Augmentation using Pre-trained Transformer Models

@article{Kumar2020DataAU,
  title={Data Augmentation using Pre-trained Transformer Models},
  author={Varun Kumar and Ashutosh Choudhary and Eunah Cho},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.02245}
}
  • Varun Kumar, Ashutosh Choudhary, Eunah Cho
  • Published 2020
  • Computer Science
  • ArXiv
  • Pre-trained language models such as BERT have provided significant gains across different NLP tasks. In this paper, we study different types of pre-trained transformer-based models, namely auto-regressive models (GPT-2), auto-encoder models (BERT), and seq2seq models (BART), for conditional data augmentation. We show that prepending class labels to text sequences provides a simple yet effective way to condition the pre-trained models for data augmentation. On three classification…
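
The label-prepending idea described in the abstract is easy to sketch. The snippet below is a minimal, hypothetical illustration using the HuggingFace `transformers` library with an off-the-shelf GPT-2 checkpoint: it conditions generation by prepending a class label to the prompt and samples candidate augmented sentences. The prompt format, sampling settings, and the `augment` helper are assumptions made for illustration; in the paper's setup the generator would first be fine-tuned on label-prepended training data rather than used zero-shot.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Off-the-shelf GPT-2; the paper fine-tunes on label-prepended training text first.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def augment(label, seed_text="", num_samples=3, max_length=40):
    # Condition generation by prepending the class label to the prompt
    # (the exact prompt format here is an assumption, not the paper's).
    prompt = f"{label} {seed_text}".strip()
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,                       # sample for diverse augmentations
        top_p=0.95,
        max_length=max_length,
        num_return_sequences=num_samples,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

# Example: candidate synthetic sentences for a "positive" class.
for sentence in augment("positive", "The movie was"):
    print(sentence)
```

A similar recipe applies to the BERT- and BART-style models studied in the paper, with masked-token or denoising reconstruction taking the place of left-to-right sampling.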

    Citations

    Publications citing this paper.
    SHOWING 1-2 OF 2 CITATIONS

    G-DAUG: Generative Data Augmentation for Commonsense Reasoning

    CITES BACKGROUND

    Pre-trained Models for Natural Language Processing: A Survey

    CITES BACKGROUND

    References

    Publications referenced by this paper.
    SHOWING 1-10 OF 27 REFERENCES

    Conditional BERT Contextual Augmentation

    HIGHLY INFLUENTIAL

    RoBERTa: A Robustly Optimized BERT Pretraining Approach

    HIGHLY INFLUENTIAL

    Attention Is All You Need
