Corpus ID: 220265581

Training highly effective connectivities within neural networks with randomly initialized, fixed weights

@article{Ivan2020TrainingHE,
  title={Training highly effective connectivities within neural networks with randomly initialized, fixed weights},
  author={Cristian Ivan and R. Florian},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.16627}
}
We present some novel, straightforward methods for training the connection graph of a randomly initialized neural network without training the weights. These methods do not use hyperparameters defining cutoff thresholds and therefore remove the need for iteratively searching for optimal values of such hyperparameters. We can achieve similar or higher performance than when training all weights, at a computational cost similar to that of standard training techniques. Besides switching…
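The abstract above is truncated and does not spell out the training rule. As a rough, hypothetical illustration of the general idea, optimizing which connections are active while the randomly initialized weights stay frozen, here is a minimal sketch that scores each connection and binarizes the scores with a straight-through estimator; the sign-based thresholding, layer sizes, and optimizer settings are assumptions for illustration and not the authors' method.

```python
# Hypothetical sketch: train a binary connectivity mask over frozen random
# weights. The sign-based mask and all hyperparameters are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedLinear(nn.Module):
    """Linear layer whose weights stay at their random initialization;
    only the per-connection scores (and hence the mask) are trained."""

    def __init__(self, in_features, out_features):
        super().__init__()
        weight = torch.empty(out_features, in_features)
        nn.init.kaiming_normal_(weight)
        self.weight = nn.Parameter(weight, requires_grad=False)  # fixed weights
        self.scores = nn.Parameter(0.01 * torch.randn(out_features, in_features))

    def forward(self, x):
        hard_mask = (self.scores > 0).float()
        # Straight-through estimator: forward uses the binary mask,
        # backward treats the mask as the identity of the scores.
        mask = hard_mask + self.scores - self.scores.detach()
        return F.linear(x, self.weight * mask)


model = nn.Sequential(MaskedLinear(784, 300), nn.ReLU(), MaskedLinear(300, 10))
trainable = [p for p in model.parameters() if p.requires_grad]  # only the scores
optimizer = torch.optim.Adam(trainable, lr=1e-3)
```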
Citations

Rescaling CNN through Learnable Repetition of Network Parameters
It is shown that small base networks, when rescaled, can provide performance comparable to deeper networks with as low as 6% of the optimization parameters of the deeper one, and that up to 40% of the relative gain reported by state-of-the-art methods for rotation equivariance could actually be due to just the learnt repetition of weights.
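The cited paper's exact rescaling scheme is not described here; the following is a minimal sketch under the assumption that "learnable repetition" amounts to reusing one block's parameters several times along the depth of a network, so depth grows without adding optimization parameters. All module names and shapes are illustrative.

```python
# Minimal sketch (assumption): grow depth by reusing one block's parameters
# several times, so the parameter count stays that of the small base block.
import torch
import torch.nn as nn


class RepeatedBlockNet(nn.Module):
    def __init__(self, channels=64, repeats=4, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        # A single block whose weights are reused `repeats` times in forward().
        self.shared_block = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.repeats = repeats
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        x = torch.relu(self.stem(x))
        for _ in range(self.repeats):   # same parameters applied repeatedly
            x = self.shared_block(x)
        x = x.mean(dim=(2, 3))          # global average pooling
        return self.head(x)
```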

References

Showing 1-10 of 22 references
Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization
This work suggests that exploring structural degrees of freedom during training is more effective than adding extra parameters to the network, and outperforms previous static and dynamic reparameterization methods, yielding the best accuracy for a fixed parameter budget.
Understanding the difficulty of training deep feedforward neural networks
The objective is to understand why standard gradient descent from random initialization performs so poorly with deep neural networks, in order to explain recent relative successes and help design better algorithms in the future.
SNIP: Single-shot Network Pruning based on Connection Sensitivity
This work presents a new approach that prunes a given network once at initialization, prior to training, and introduces a saliency criterion based on connection sensitivity that identifies structurally important connections in the network for the given task.
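The connection-sensitivity criterion can be stated compactly: the saliency of connection j is |w_j · ∂L/∂w_j|, estimated from a single mini-batch at initialization, and the least salient connections are pruned before training. A minimal sketch, with the function name and normalization being illustrative choices:

```python
# Minimal sketch of a SNIP-style saliency: |weight * gradient| per connection,
# computed from one batch at initialization and normalized to sum to 1.
import torch
import torch.nn.functional as F


def connection_saliency(model, inputs, targets):
    params = [p for p in model.parameters() if p.requires_grad]
    loss = F.cross_entropy(model(inputs), targets)
    grads = torch.autograd.grad(loss, params)
    scores = [(p * g).abs() for p, g in zip(params, grads)]
    total = sum(s.sum() for s in scores)
    return [s / total for s in scores]  # keep the top-k connections, prune the rest
```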
Understanding deep learning requires rethinking generalization
These experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data, and confirm that simple depth-two neural networks already have perfect finite-sample expressivity.
What’s Hidden in a Randomly Weighted Neural Network?
It is empirically shown that as randomly weighted neural networks with fixed weights grow wider and deeper, an "untrained subnetwork" approaches the accuracy of a network with learned weights.
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
This work finds that dense, randomly initialized, feed-forward networks contain subnetworks ("winning tickets") that, when trained in isolation, reach test accuracy comparable to the original network in a similar number of iterations, and articulates the "lottery ticket hypothesis".
Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers
The authors' dynamic sparse training algorithm can train very sparse neural network models with little performance loss using the same number of training epochs as dense models, and it reveals the underlying problems of traditional three-stage pruning algorithms.
NeST: A Neural Network Synthesis Tool Based on a Grow-and-Prune Paradigm
A network growth algorithm that complements network pruning to learn both weights and compact DNN architectures during training, and delivers significant additional parameter and FLOPs reduction relative to pruning-only methods.
Estimating or Propagating Gradients Through Stochastic Neurons
It is demonstrated that a simple, biologically plausible formula gives rise to an unbiased (but noisy) estimator of the gradient with respect to a binary stochastic neuron's firing probability, along with an approach for approximating this unbiased but high-variance estimator by learning to predict it using a biased estimator.
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
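The robust initialization derived in that work keeps the activation variance stable through rectifier layers: for a rectifier with negative slope a (a = 0 gives plain ReLU), the weight variance is set to 2 / ((1 + a²) · fan_in). A small sketch of that rule (the function name is illustrative; PyTorch's nn.init.kaiming_normal_ implements the same formula):

```python
# Sketch of He/"Kaiming" initialization for a layer followed by a (P)ReLU
# with negative slope `a`; a = 0 recovers the plain-ReLU case.
import math
import torch


def he_normal(fan_out, fan_in, a=0.0):
    std = math.sqrt(2.0 / ((1.0 + a ** 2) * fan_in))
    return torch.randn(fan_out, fan_in) * std
```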