Corpus ID: 236956390

Unified Regularity Measures for Sample-wise Learning and Generalization

  • Chi Zhang, Xiaoning Ma, Yu Liu, Le Wang, Yuanqi Su, Yuehu Liu
  • Published 9 August 2021
  • Computer Science
  • ArXiv
Fundamental machine learning theory shows that different samples contribute unequally to both the learning and testing processes. Contemporary studies on DNNs imply that such sample differences are rooted in the distribution of intrinsic pattern information, namely sample regularity. Motivated by recent discoveries on network memorization and generalization, we propose a pair of sample regularity measures for both processes with a formulation-consistent representation. Specifically, cumulative…


Exploring the Memorization-Generalization Continuum in Deep Learning
This work analyzes how individual instances are treated by a model on the memorization-generalization continuum via a consistency score, and explores three proxies to the consistency score: kernel density estimation on input and hidden representations, and the time course of training, i.e., learning speed.
Does learning require memorization? a short tale about a long tail
The model makes it possible to quantify the effect of not fitting the training data on the generalization performance of the learned classifier, demonstrates that memorization is necessary whenever frequencies are long-tailed, and establishes a formal link between these empirical phenomena.
An Empirical Study of Example Forgetting during Deep Neural Network Learning
It is found that certain examples are forgotten with high frequency, and some not at all; a data set's (un)forgettable examples generalize across neural architectures; and a significant fraction of examples can be omitted from the training data set while still maintaining state-of-the-art generalization performance.
Understanding deep learning requires rethinking generalization
These experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data, and confirm that simple depth-two neural networks already have perfect finite sample expressivity.
Data Dropout: Optimizing Training Data for Convolutional Neural Networks
  • Tianyang Wang, Jun Huan, Bo Li
  • Computer Science
  • 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI)
  • 2018
It is demonstrated that deep learning models such as convolutional neural networks may not favor all training samples, and that generalization accuracy can be further improved by dropping those unfavorable samples; a Two-Round Training approach is proposed, aiming to achieve higher generalization accuracy.
Overcoming catastrophic forgetting in neural networks
It is shown that it is possible to overcome this limitation of connectionist models and train networks that can maintain expertise on tasks they have not experienced for a long time, by selectively slowing down learning on the weights important for previous tasks.
Not All Samples Are Created Equal: Deep Learning with Importance Sampling
A principled importance sampling scheme is proposed that focuses computation on "informative" examples and reduces the variance of the stochastic gradients during training, and a tractable upper bound to the per-sample gradient norm is derived.
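As an illustration only (not the cited paper's exact scheme), sampling training examples in proportion to a per-sample score is straightforward to sketch. The snippet below uses the per-sample loss as a stand-in for the gradient-norm bound and applies inverse-probability weights to keep the gradient estimate unbiased; the function name, the `alpha` temperature, and the loss-as-score proxy are assumptions for this sketch.

```python
import numpy as np

def importance_sample(losses, batch_size, alpha=1.0, rng=None):
    """Draw example indices with probability proportional to
    per-sample loss**alpha (a common proxy for the per-sample
    gradient-norm bound), returning unbiasing weights."""
    rng = np.random.default_rng() if rng is None else rng
    scores = np.asarray(losses, dtype=float) ** alpha
    probs = scores / scores.sum()
    idx = rng.choice(len(losses), size=batch_size, replace=False, p=probs)
    # Inverse-probability weights keep the weighted gradient
    # estimate unbiased relative to uniform sampling.
    weights = 1.0 / (len(losses) * probs[idx])
    return idx, weights

losses = [0.1, 2.0, 0.05, 1.5, 0.3, 0.9]
idx, w = importance_sample(losses, batch_size=3)
```

High-loss examples such as the second and fourth are drawn more often, while their smaller weights compensate so the expected update matches uniform sampling.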
Deep learning generalizes because the parameter-function map is biased towards simple functions
This paper argues that the parameter-function map of many DNNs should be exponentially biased towards simple functions, and provides clear evidence for this strong simplicity bias in a model DNN for Boolean functions, as well as in much larger fully connected and convolutional networks applied to CIFAR10 and MNIST.
Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem
This chapter discusses catastrophic interference in connectionist networks; the simulation results demonstrate only that interference is catastrophic in some specific networks.
The unreasonable effectiveness of deep learning in artificial intelligence
  • T. Sejnowski
  • Computer Science, Biology
  • Proceedings of the National Academy of Sciences
  • 2020
Deep learning was inspired by the architecture of the cerebral cortex, and insights into autonomy and general intelligence may be found in other brain regions that are essential for planning and survival, but major breakthroughs will be needed to achieve these goals.