Corpus ID: 221083248

Neural Complexity Measures

@article{Lee2020NeuralCM,
  title={Neural Complexity Measures},
  author={Yoonho Lee and Juho Lee and Sung Ju Hwang and Eunho Yang and Seungjin Choi},
  journal={ArXiv},
  year={2020},
  volume={abs/2008.02953}
}
While various complexity measures for deep neural networks exist, specifying an appropriate measure capable of predicting and explaining generalization in deep networks has proven challenging. We propose Neural Complexity (NC), a meta-learning framework for predicting generalization. Our model learns a scalar complexity measure through interactions with many heterogeneous tasks in a data-driven way. The trained NC model can be added to the standard training loss to regularize any task learner…
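As a rough illustration of the idea just described, the sketch below shows how a learned scalar complexity estimate could be added as a penalty to a task learner's standard training loss. This is a minimal sketch assuming a PyTorch setup, not the authors' implementation; NCModel, regularized_loss, and the learner-feature input are hypothetical placeholders.

import torch
import torch.nn as nn

class NCModel(nn.Module):
    """Hypothetical stand-in for a trained Neural Complexity network that maps
    features describing a task learner to a scalar generalization-gap estimate."""
    def __init__(self, feature_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, learner_features: torch.Tensor) -> torch.Tensor:
        # Average per-example predictions into a single scalar complexity estimate.
        return self.net(learner_features).mean()

def regularized_loss(task_loss: torch.Tensor,
                     nc_model: NCModel,
                     learner_features: torch.Tensor,
                     lam: float = 0.1) -> torch.Tensor:
    # Standard training loss plus the learned complexity penalty, weighted by lam.
    return task_loss + lam * nc_model(learner_features)

# Usage sketch: nc = NCModel(feature_dim=32); loss = regularized_loss(ce_loss, nc, feats)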
Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics
This work presents a unified framework to understand and accelerate NAS, by disentangling “TEG” characteristics of searched networks – Trainability, Expressivity, Generalization – all assessed in a training-free manner, leading to both improved search accuracy and over 2.3× reduction in search time cost.

References

Showing 1–10 of 47 references
Towards Task and Architecture-Independent Generalization Gap Predictors
Both DNNs and RNNs consistently and significantly outperform linear models, with results reported for architecture-independent, task-independent, and out-of-distribution generalization gap prediction tasks.
Predicting Neural Network Accuracy from Weights
We show experimentally that the accuracy of a trained neural network can be predicted surprisingly well by looking only at its weights, without evaluating it on input data. We motivate this task and…
Fantastic Generalization Measures and Where to Find Them
This work presents the first large-scale study of generalization in deep networks, investigating more than 40 complexity measures taken from both theoretical bounds and empirical studies, and showing surprising failures of some measures as well as promising measures for further research.
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning…
Optimization as a Model for Few-Shot Learning
MetaReg: Towards Domain Generalization using Meta-Regularization
Experimental validation on computer vision and natural language datasets indicates that encoding the notion of domain generalization as a regularization function, learned within a learning-to-learn (meta-learning) framework, yields regularizers that achieve good cross-domain generalization.
Spectrally-normalized margin bounds for neural networks
This bound is empirically investigated for a standard AlexNet network trained with SGD on the MNIST and CIFAR-10 datasets, with both original and random labels; the bound, the Lipschitz constants, and the excess risks are all in direct correlation, suggesting both that SGD selects predictors whose complexity scales with the difficulty of the learning task, and that the presented bound is sensitive to this complexity.
Generalization in Deep Networks: The Role of Distance from Initialization
Empirical evidence is provided that the model capacity of SGD-trained deep networks is in fact restricted through implicit regularization of the distance from initialization, alongside theoretical arguments that further highlight the need for initialization-dependent notions of model capacity (see the sketch after this reference list).
Prototypical Networks for Few-shot Learning
This work proposes Prototypical Networks for few-shot classification, and provides an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning.
Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data
By optimizing a PAC-Bayes bound directly, the approach of Langford and Caruana (2001) is extended to obtain nonvacuous generalization bounds for deep stochastic neural network classifiers with millions of parameters, trained on only tens of thousands of examples.
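For concreteness, the sketch below (assuming PyTorch; not code from the papers above) computes two weight-based quantities in the spirit of the spectrally-normalized and distance-from-initialization measures cited in this list: a simplified product-of-spectral-norms proxy over 2-D weight matrices, rather than the full margin bound, and the Euclidean distance of the current parameters from their initial values.

import torch
import torch.nn as nn

def spectral_norm_product(model: nn.Module) -> float:
    # Product of spectral norms of all 2-D weight matrices (e.g. nn.Linear weights);
    # convolutional kernels and biases are skipped in this simplified proxy.
    prod = 1.0
    for p in model.parameters():
        if p.dim() == 2:
            prod *= torch.linalg.matrix_norm(p, ord=2).item()
    return prod

def distance_from_init(model: nn.Module, init_state: dict) -> float:
    # Euclidean distance between current parameters and a saved copy of their
    # initial values, keyed by parameter name.
    sq = 0.0
    for name, p in model.named_parameters():
        sq += (p.detach() - init_state[name]).pow(2).sum().item()
    return sq ** 0.5

# Usage sketch:
# model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
# init_state = {k: v.detach().clone() for k, v in model.named_parameters()}
# ... train ...
# print(spectral_norm_product(model), distance_from_init(model, init_state))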