Corpus ID: 202773824

Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates

@article{Negrea2019InformationTheoreticGB,
  title={Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates},
  author={Jeffrey Negrea and Mahdi Haghifam and Gintare Karolina Dziugaite and Ashish Khisti and Daniel M. Roy},
  journal={ArXiv},
  year={2019},
  volume={abs/1911.02151}
}
  • Jeffrey Negrea, Mahdi Haghifam, Gintare Karolina Dziugaite, Ashish Khisti, Daniel M. Roy
  • Published 2019
  • Mathematics, Computer Science
  • ArXiv
  • In this work, we improve upon the stepwise analysis of noisy iterative learning algorithms initiated by Pensia, Jog, and Loh (2018) and recently extended by Bu, Zou, and Veeravalli (2019). Our main contributions are significantly improved mutual information bounds for Stochastic Gradient Langevin Dynamics via data-dependent estimates. Our approach is based on the variational characterization of mutual information and the use of data-dependent priors that forecast the mini-batch gradient based…
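
To make the forecasting idea concrete, here is a minimal sketch (in Python; the function names, the held-out `grad_forecast`, and the choice of Gaussian transition kernels with shared covariance are illustrative assumptions, not the authors' code). For SGLD with step size `lr` and inverse temperature `beta`, the KL divergence between the algorithm's transition kernel and a data-dependent prior kernel whose mean uses a forecast of the mini-batch gradient reduces to a scaled squared forecast error:

```python
import numpy as np

def sgld_step(w, grad, lr, beta, rng):
    """One SGLD update: a gradient step plus isotropic Gaussian noise
    with variance 2*lr/beta, so the transition kernel is
    N(w - lr*grad, (2*lr/beta) * I)."""
    return w - lr * grad + rng.normal(scale=np.sqrt(2.0 * lr / beta), size=w.shape)

def per_step_kl(grad_batch, grad_forecast, lr, beta):
    """KL between two Gaussians with means w - lr*grad_batch and
    w - lr*grad_forecast and common covariance (2*lr/beta) * I:

        KL = (lr * beta / 4) * ||grad_batch - grad_forecast||^2.

    Summing these per-iteration terms yields a data-dependent estimate
    that can stand in for the mutual information in the bound."""
    return lr * beta / 4.0 * float(np.sum((grad_batch - grad_forecast) ** 2))
```

The payoff of this decomposition is that a good forecast (for instance, a gradient computed from data outside the current mini-batch) drives each per-step term toward zero, which is what tightens the estimate relative to a fixed, data-independent prior.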
    16 Citations

    • Nonvacuous Loss Bounds with Fast Rates for Neural Networks via Conditional Information Measures
    • On Random Subset Generalization Error Bounds and the Stochastic Gradient Langevin Dynamics Algorithm
    • Tightening Mutual Information Based Bounds on Generalization Error
    • Generalization Bounds via Information Density and Conditional Information Density
    • Shape Matters: Understanding the Implicit Bias of the Noise Covariance
    • On the role of data in PAC-Bayes bounds
    • Reasoning About Generalization via Conditional Mutual Information
    • Conditional Mutual Information Bound for Meta Generalization Gap

    References

    Showing 1–10 of 36 references
    • On Generalization Error Bounds of Noisy Gradient Methods for Non-Convex Learning
    • Information matrices and generalization
    • Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints
    • Generalization Error Bounds for Noisy, Iterative Algorithms
    • Entropy-SGD optimizes the prior of a PAC-Bayes bound: Data-dependent PAC-Bayes priors via differential privacy
    • Train faster, generalize better: Stability of stochastic gradient descent
    • Tightening Mutual Information Based Bounds on Generalization Error
    • Chaining Mutual Information and Tightening Generalization Bounds
    • Information-theoretic analysis of stability and bias of learning algorithms