Corpus ID: 215745042

A Non-Parametric Test to Detect Data-Copying in Generative Models

@article{Meehan2020ANT,
  title={A Non-Parametric Test to Detect Data-Copying in Generative Models},
  author={Casey Meehan and Kamalika Chaudhuri and Sanjoy Dasgupta},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.05675}
}
  • Casey Meehan, Kamalika Chaudhuri, Sanjoy Dasgupta
  • Published 2020
  • Mathematics, Computer Science
  • ArXiv
  • Detecting overfitting in generative models is an important challenge in machine learning. In this work, we formalize a form of overfitting that we call data-copying, where the generative model memorizes and outputs training samples or small variations thereof. We provide a three-sample non-parametric test for detecting data-copying that uses the training set, a separate sample from the target distribution, and a generated sample from the model, and study the performance of our test on…
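
The three-sample setup in the abstract can be illustrated with a small sketch. This is not taken from the paper, whose procedure is cut off in the excerpt above. It assumes one plausible instantiation: measure each point's Euclidean distance to its nearest training point, then compare the generated sample against the held-out sample with a Mann-Whitney U statistic. The function names, the toy 2-D data, and the sign convention (negative means "closer to the training set than expected") are all hypothetical.

```python
import math
import random


def nearest_train_distance(point, train):
    """Euclidean distance from `point` to its nearest neighbor in `train`."""
    return min(math.dist(point, t) for t in train)


def copying_z_score(train, held_out, generated):
    """Mann-Whitney U z-score on nearest-training-neighbor distances.

    Assumed convention for this sketch: a strongly negative score means the
    generated points sit closer to the training set than held-out points do.
    """
    d_gen = [nearest_train_distance(g, train) for g in generated]
    d_out = [nearest_train_distance(h, train) for h in held_out]
    m, n = len(d_gen), len(d_out)
    # U counts pairs where the generated distance exceeds the held-out
    # distance; ties count as one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0
            for a in d_gen for b in d_out)
    mean_u = m * n / 2.0
    std_u = math.sqrt(m * n * (m + n + 1) / 12.0)  # normal approximation
    return (u - mean_u) / std_u


def gaussian_points(rng, count):
    """Toy stand-in for the target distribution: 2-D standard Gaussian."""
    return [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(count)]


rng = random.Random(0)
train = gaussian_points(rng, 200)
held_out = gaussian_points(rng, 50)

# A "model" that draws fresh points from the target distribution.
fresh = gaussian_points(rng, 50)
# A "model" that replays training points with tiny perturbations.
copied = [(x + rng.gauss(0, 1e-3), y + rng.gauss(0, 1e-3))
          for x, y in rng.sample(train, 50)]

z_fresh = copying_z_score(train, held_out, fresh)
z_copy = copying_z_score(train, held_out, copied)
print(f"fresh model z = {z_fresh:.2f}, copying model z = {z_copy:.2f}")
```

In this toy setting the copying model should score far below the fresh model, because nearly all of its nearest-neighbor distances are smaller than those of the held-out sample. A global statistic like this one is only a rough illustration of the idea and says nothing about how the authors handle localized copying.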

    Citations

    Publications citing this paper.

    Privacy in Deep Learning: A Survey

    References

    Publications referenced by this paper.

    Generative Adversarial Nets

    On GANs and GMMs

    Revisiting Classifier Two-Sample Tests

    Improved Techniques for Training GANs
