Equality of Opportunity in Supervised Learning
This work proposes a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features, and shows how to optimally adjust any learned predictor so as to remove discrimination according to this definition.
Pegasos: primal estimated sub-gradient solver for SVM
A simple and effective stochastic sub-gradient descent algorithm for solving the optimization problem cast by Support Vector Machines, which is particularly well suited for large text classification problems, and demonstrates an order-of-magnitude speedup over previous SVM learning methods.
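The core Pegasos update can be sketched in a few lines: at step t, pick one random example, use step size 1/(λt), and apply the hinge-loss sub-gradient plus shrinkage from the regularizer. This is a minimal numpy illustration of that scheme (the function name, toy data, and hyperparameters are illustrative, not from the paper):

```python
import numpy as np

def pegasos(X, y, lam=0.1, n_iters=2000, seed=0):
    """Stochastic sub-gradient descent for the SVM objective
    min_w (lam/2)*||w||^2 + (1/n) * sum_i max(0, 1 - y_i * <w, x_i>)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.integers(n)                     # sample one training example
        eta = 1.0 / (lam * t)                   # Pegasos step size 1/(lam*t)
        if y[i] * X[i].dot(w) < 1:              # margin violated: hinge term active
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
        else:                                   # only the regularizer contributes
            w = (1 - eta * lam) * w
        # optional projection onto the ball of radius 1/sqrt(lam)
        norm = np.linalg.norm(w)
        if norm > 1.0 / np.sqrt(lam):
            w *= (1.0 / np.sqrt(lam)) / norm
    return w

# usage on a tiny linearly separable toy set
X = np.array([[2.0, 1.0], [1.0, 2.0], [-2.0, -1.0], [-1.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = pegasos(X, y)
```

Note the cost per iteration is independent of n, which is what makes the method attractive for large text-classification corpora.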
Maximum-Margin Matrix Factorization
A novel approach to collaborative prediction is presented, using low-norm instead of low-rank factorizations; it is inspired by, and has strong connections to, large-margin linear discrimination.
Fast maximum margin matrix factorization for collaborative prediction
This work investigates a direct gradient-based optimization method for MMMF, finds that MMMF substantially outperforms the nine methods tested, and demonstrates it on large collaborative prediction problems.
The Implicit Bias of Gradient Descent on Separable Data
- Daniel Soudry, Elad Hoffer, Suriya Gunasekar, Nathan Srebro
- Computer Science · J. Mach. Learn. Res.
- 27 October 2017
We examine gradient descent on unregularized logistic regression problems, with homogeneous linear predictors on linearly separable datasets. We show the predictor converges to the direction of the…
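The setting above is easy to reproduce numerically: run full-batch gradient descent on the unregularized logistic loss over separable data and observe that the weight norm keeps growing while the direction stabilizes. A minimal sketch (function name, data, and step size are illustrative assumptions):

```python
import numpy as np

def gd_logistic(X, y, lr=0.1, n_iters=5000):
    """Full-batch gradient descent on the unregularized logistic loss
    L(w) = (1/n) * sum_i log(1 + exp(-y_i * <w, x_i>))."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        margins = y * (X @ w)
        # sigmoid(-margin) weights each example's contribution to the gradient
        coef = y / (1.0 + np.exp(margins))
        grad = -(X * coef[:, None]).mean(axis=0)
        w -= lr * grad
    return w

# separable toy data: the loss has no finite minimizer, so ||w|| diverges
X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w_early = gd_logistic(X, y, n_iters=1000)
w_late = gd_logistic(X, y, n_iters=5000)
```

Because the data is separable, the loss can always be decreased by scaling w up, so the iterates never converge in norm; it is the limiting direction w/||w|| that the paper characterizes.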
The Marginal Value of Adaptive Gradient Methods in Machine Learning
It is observed that the solutions found by adaptive methods generalize worse (often significantly worse) than SGD, even when these solutions have better training performance, suggesting that practitioners should reconsider the use of adaptive methods to train neural networks.
Exploring Generalization in Deep Learning
- Behnam Neyshabur, Srinadh Bhojanapalli, D. McAllester, Nathan Srebro
- Computer Science · NIPS
- 27 June 2017
This work considers several recently suggested explanations for what drives generalization in deep networks, including norm-based control, sharpness and robustness, and investigates how these measures explain different observed phenomena.
Communication-Efficient Distributed Optimization using an Approximate Newton-type Method
A novel Newton-type method for distributed optimization is presented; it is particularly well suited for stochastic optimization and learning problems, and enjoys a linear rate of convergence that provably improves with the data size.
Weighted Low-Rank Approximations
This work provides a simple and efficient algorithm for solving weighted low-rank approximation problems, which, unlike their unweighted version, do not admit a closed-form solution in general.
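One simple iterative scheme for this problem (an EM-style iteration in the spirit of the approach described, with illustrative names and data) alternates between "filling in" low-weight entries from the current estimate and taking an ordinary, unweighted rank-k SVD of the filled matrix:

```python
import numpy as np

def weighted_low_rank(A, W, k, n_iters=100):
    """Weighted low-rank approximation via an EM-style iteration.

    Minimizes sum_ij W_ij * (X_ij - A_ij)^2 over rank-k matrices X,
    with weights W in [0, 1]. Each step blends the target A (where
    weights are high) with the current estimate X (where weights are
    low), then takes the best unweighted rank-k approximation via SVD.
    """
    X = np.zeros_like(A)
    for _ in range(n_iters):
        filled = W * A + (1.0 - W) * X          # E-step: impute low-weight entries
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        X = (U[:, :k] * s[:k]) @ Vt[:k]         # M-step: best rank-k SVD truncation
    return X

# usage: recover a rank-2 matrix observed through a binary weight mask
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))   # exact rank 2
W = (rng.random((8, 6)) > 0.3).astype(float)                    # ~70% observed
X = weighted_low_rank(A, W, k=2, n_iters=100)
```

With uniform weights the iteration reduces to a single truncated SVD, which is exactly the closed-form solution available in the unweighted case.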
A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks
- Behnam Neyshabur, Srinadh Bhojanapalli, David A. McAllester, Nathan Srebro
- Computer Science, Mathematics · ICLR
- 29 July 2017
A generalization bound for feedforward neural networks is presented in terms of the product of the spectral norms of the layers and the Frobenius norm of the weights, using a PAC-Bayes analysis.