The Singular Values of Convolutional Layers
- Hanie Sedghi, Vineet Gupta, Philip M. Long
- Computer Science, Mathematics · International Conference on Learning Representations
- 26 May 2018
Projecting a network's convolutional layers onto an operator-norm ball is shown to be an effective regularizer; for example, it improves the test error of a deep residual network using batch normalization on CIFAR-10 from 6.2% to 5.3%.
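A minimal NumPy sketch of the paper's key observation: the singular values of the linear operator implemented by a circular convolutional layer are exactly the singular values of the small per-frequency matrices obtained from the 2-D Fourier transform of the kernel. The function name and the `(k, k, c_in, c_out)` kernel layout are conventions assumed here, not mandated by the paper.

```python
import numpy as np

def conv_singular_values(kernel, input_shape):
    # kernel: (k, k, c_in, c_out); input_shape: (n, n) spatial size of the input.
    # The FFT of the kernel at each 2-D frequency yields a c_in x c_out matrix;
    # the layer's singular values are the union of those matrices' singular values.
    transforms = np.fft.fft2(kernel, input_shape, axes=(0, 1))  # (n, n, c_in, c_out)
    return np.linalg.svd(transforms, compute_uv=False)          # (n, n, min(c_in, c_out))
```

Clipping the largest of these values (i.e., projecting the layer onto an operator-norm ball) gives the regularizer the summary refers to.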
What is being transferred in transfer learning?
- Behnam Neyshabur, Hanie Sedghi, Chiyuan Zhang
- Computer Science · Neural Information Processing Systems
- 26 August 2020
Through a series of analyses on transferring to block-shuffled images, the effect of feature reuse is separated from that of learning low-level data statistics, and it is shown that part of the benefit of transfer learning comes from the latter.
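A short sketch of the block-shuffling probe under stated assumptions: the image is partitioned into square tiles that are then randomly permuted, destroying global structure while preserving low-level statistics. The function name and the divisibility requirement are simplifications of this sketch.

```python
import numpy as np

def block_shuffle(image, block):
    # image: (h, w, c) array with h and w divisible by `block`.
    # Split into block x block tiles, permute the tiles, and reassemble.
    h, w, c = image.shape
    tiles = image.reshape(h // block, block, w // block, block, c).swapaxes(1, 2)
    flat = tiles.reshape(-1, block, block, c)
    flat = flat[np.random.permutation(len(flat))]
    tiles = flat.reshape(h // block, w // block, block, block, c).swapaxes(1, 2)
    return tiles.reshape(h, w, c)
```

Smaller blocks destroy more structure; comparing transfer performance across block sizes is what lets the two effects be separated.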
Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods
- Majid Janzamin, Hanie Sedghi, Anima Anandkumar
- Computer Science
- 1 March 2017
This work proposes a computationally efficient method, based on tensor decomposition, with guaranteed risk bounds for training neural networks with one hidden layer; it provably converges to the global optimum under a set of mild non-degeneracy conditions.
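The guarantee rests on reducing training to the decomposition of a moment tensor built from score functions, whose workhorse primitive is the tensor power method. Below is a minimal sketch of that primitive for a symmetric third-order tensor; the whitening and deflation steps the paper's full algorithm requires are omitted, and all names are illustrative.

```python
import numpy as np

def tensor_power_method(T, num_iters=100):
    # Recover the dominant rank-1 component of a symmetric 3-way tensor
    # T ≈ sum_j w_j a_j ⊗ a_j ⊗ a_j via tensor power iteration.
    d = T.shape[0]
    v = np.random.randn(d)
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        v = np.einsum('ijk,j,k->i', T, v, v)  # tensor-vector-vector product T(I, v, v)
        v /= np.linalg.norm(v)
    weight = np.einsum('ijk,i,j,k->', T, v, v, v)
    return weight, v
```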
Score Function Features for Discriminative Learning: Matrix and Tensor Framework
- Majid Janzamin, Hanie Sedghi, Anima Anandkumar
- Computer Science · arXiv
- 8 December 2014
This paper considers a novel class of matrix- and tensor-valued features, which can be pre-trained using unlabeled samples, and presents efficient algorithms for extracting discriminative information given these pre-trained features and labeled samples for any related task.
Provable Tensor Methods for Learning Mixtures of Generalized Linear Models
- Hanie Sedghi, Majid Janzamin, Anima Anandkumar
- Computer Science · International Conference on Artificial Intelligence and Statistics
- 9 December 2014
This work considers the problem of learning mixtures of generalized linear models (GLM) which arise in classification and regression problems and presents a tensor decomposition method which is guaranteed to correctly recover the parameters.
Leveraging Unlabeled Data to Predict Out-of-Distribution Performance
- S. Garg, Sivaraman Balakrishnan, Zachary Chase Lipton, Behnam Neyshabur, Hanie Sedghi
- Computer Science · International Conference on Learning Representations
- 11 January 2022
Average Thresholded Confidence (ATC) is proposed, a practical method that learns a threshold on the model’s confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
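A sketch of the ATC recipe under simplifying assumptions (using the maximum softmax probability as the confidence score; the paper also studies a negative-entropy score): choose the threshold on held-out labeled source data so that the fraction of points above it matches the source accuracy, then report the fraction of unlabeled target points above that threshold. Argument names are illustrative.

```python
import numpy as np

def atc_predict_accuracy(source_conf, source_correct, target_conf):
    # source_conf:    confidence scores on labeled source validation data
    # source_correct: boolean array, whether each source prediction was correct
    # target_conf:    confidence scores on unlabeled target data
    err = 1.0 - source_correct.mean()
    # Threshold t such that a fraction `err` of source scores fall below it,
    # so the fraction above t equals the source accuracy.
    t = np.quantile(source_conf, err)
    return float((target_conf >= t).mean())  # predicted target accuracy
```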
Generalization bounds for deep convolutional neural networks
- Philip M. Long, Hanie Sedghi
- Computer Science · International Conference on Learning Representations
- 29 May 2019
Bounds on the generalization error of convolutional networks are proved in terms of the training loss, the number of parameters, the Lipschitz constant of the loss, and the distance from the weights to the initial weights.
The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks
- R. Entezari, Hanie Sedghi, O. Saukh, Behnam Neyshabur
- Mathematics, Computer Science · International Conference on Learning Representations
- 12 October 2021
If the permutation invariance of neural networks is taken into account, SGD solutions will likely have no barrier along the linear interpolation between them, which has implications for the lottery ticket hypothesis, distributed training, and ensemble methods.
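The central quantity here is the loss barrier along the linear path between two solutions; the claim is that after an appropriate permutation of one network's units, this barrier is likely zero. A minimal sketch of the barrier computation, assuming `loss_fn` maps a flat parameter vector to a scalar loss (an interface chosen for illustration):

```python
import numpy as np

def loss_barrier(loss_fn, theta_a, theta_b, num_points=25):
    # Max over alpha of the loss on interpolated weights, minus the linear
    # interpolation of the two endpoint losses; a zero barrier means the
    # two solutions are linearly mode connected.
    la, lb = loss_fn(theta_a), loss_fn(theta_b)
    alphas = np.linspace(0.0, 1.0, num_points)
    gaps = [loss_fn((1 - a) * theta_a + a * theta_b) - ((1 - a) * la + a * lb)
            for a in alphas]
    return max(gaps)
```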
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
- Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi
- Computer Science · International Conference on Learning Representations
- 2021