Discovering Neural Wirings

@inproceedings{Wortsman2019DiscoveringNW,
  title={Discovering Neural Wirings},
  author={Mitchell Wortsman and Ali Farhadi and Mohammad Rastegari},
  booktitle={Neural Information Processing Systems},
  year={2019}
}
The success of neural networks has driven a shift in focus from feature engineering to architecture engineering. However, successful networks today are constructed using a small and manually defined set of building blocks. Even in methods of neural architecture search (NAS) the network connectivity patterns are largely constrained. In this work we propose a method for discovering neural wirings. We relax the typical notion of layers and instead enable channels to form connections independent of… 
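
As a rough illustration of what letting channels form connections independent of layers can look like in code, here is a minimal PyTorch sketch, assuming a top-k selection of edges by weight magnitude and a straight-through gradient so that currently unused edges can still swap into the wiring; the class name, shapes, and value of k are ours, not the authors'.

```python
# Minimal sketch, in the spirit of the idea above (not the authors' code):
# every possible channel-to-channel edge has a weight, only the k largest-
# magnitude edges are used in the forward pass, and gradients reach all edges
# so that dormant edges can later swap into the wiring.
import torch


class TopKEdges(torch.autograd.Function):
    @staticmethod
    def forward(ctx, weights, k):
        mask = torch.zeros_like(weights)
        mask.view(-1)[weights.abs().flatten().topk(k).indices] = 1.0
        return weights * mask

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through: pass the gradient to every edge, selected or not.
        return grad_output, None


# Toy wiring between 8 input and 8 output channels with 16 of 64 edges active.
weights = torch.randn(8, 8, requires_grad=True)
x = torch.randn(4, 8)
y = x @ TopKEdges.apply(weights, 16).t()
y.sum().backward()   # weights.grad is dense, so unused edges keep learning
```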

Optimizing Connectivity through Network Gradients for the Restricted Boltzmann Machine

This work presents a method to find optimal connectivity patterns for RBMs based on the idea of network gradients (NCG): computing the gradient of every possible connection, given a specific connection pattern, and using the gradient to drive a continuous connection strength parameter that in turn is used to determine the connection pattern.

Mining the Weights Knowledge for Optimizing Neural Network Structures

Inspired by how learning works in the mammalian brain, a switcher neural network is introduced that takes as input the weights of a task-specific neural network (TNN for short) and mines the knowledge contained in those weights for automatic architecture learning.

Deconstructing the Structure of Sparse Neural Networks

This work first measures performance when structure persists and weights are reset to a different random initialization, thereby extending experiments in Deconstructing Lottery Tickets, and investigates how early in training the structure emerges.

Revisiting Neural Architecture Search

This paper revisits the fundamental approach to NAS and proposes a novel approach called ReNAS that can search for the complete neural network without much human effort and is a step closer towards AutoML-nirvana.

Understanding the wiring evolution in differentiable neural architecture search

Questions that future differentiable methods for neural wiring discovery need to confront are posed, hoping to evoke a discussion and rethinking on how much bias has been enforced implicitly in existing NAS methods.

Graph Structure of Neural Networks

A novel graph-based representation of neural networks, called a relational graph, is developed, in which layers of neural network computation correspond to rounds of message exchange along the graph structure; the analysis shows that a "sweet spot" of relational graphs leads to neural networks with significantly improved predictive performance.
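
As a rough, hedged illustration of one round of message exchange on such a relational graph (our own sketch; the per-edge transforms and sum aggregation are simplifications, not the paper's exact formulation):

```python
# Each node owns a slice of the feature vector; one round of message exchange
# corresponds to one "layer": every node aggregates transformed messages from
# its neighbours in the relational graph.
import torch

n_nodes, dim = 4, 8
adj = torch.tensor([[1, 1, 0, 0],
                    [1, 1, 1, 0],
                    [0, 1, 1, 1],
                    [0, 0, 1, 1]], dtype=torch.float32)   # example graph
W = torch.randn(n_nodes, n_nodes, dim, dim) * 0.1         # per-edge transforms

def message_round(x):
    # x: (n_nodes, dim)
    out = torch.zeros_like(x)
    for v in range(n_nodes):
        for u in range(n_nodes):
            if adj[v, u] > 0:
                out[v] += W[v, u] @ x[u]
    return torch.relu(out)

y = message_round(torch.randn(n_nodes, dim))
```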

On the Relationship Between Topology and Gradient Propagation in Deep Networks

This paper establishes a theoretical link between NN-Mass, a topological property of neural architectures, and gradient flow characteristics, and shows that NN-Mass can identify models with similar accuracy despite significantly different size and compute requirements.

Neural networks adapting to datasets: learning network size and topology

A flexible setup is introduced that allows a neural network to learn both its size and its topology during standard gradient-based training, yielding a network whose graph structure is tailored to the particular learning task and dataset.

Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks

The Dynamic Graph Network (DG-Net) is proposed, which learns instance-aware connectivity that creates different forward paths for different input instances, giving the network greater representational ability.

Structural Learning in Artificial Neural Networks: A Neural Operator Perspective

This review surveys structural learning methods in deep ANNs, including a new neural operator framework grounded in a cellular-neuroscience context, with the aim of motivating research on this challenging topic.
...

References

Showing 1–10 of 36 references

Exploring Randomly Wired Neural Networks for Image Recognition

The results suggest that new efforts focusing on designing better network generators may lead to new breakthroughs by exploring less constrained search spaces with more room for novel design.
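
A compact, hedged sketch of the generator-to-network idea: sample a classical random graph, orient it into a DAG, and give each node a small learnable operation. The networkx generator, the linear node op, and the sum aggregation below are illustrative stand-ins for the paper's construction.

```python
# Sample a random graph, orient its edges low -> high to obtain a DAG, and let
# every node aggregate its predecessors through a small learnable operation.
import networkx as nx
import torch
import torch.nn as nn

g = nx.watts_strogatz_graph(n=8, k=4, p=0.25, seed=0)
order = sorted(g.nodes)
edges = [(min(u, v), max(u, v)) for u, v in g.edges]

class NodeOp(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, inputs):              # inputs: list of predecessor outputs
        return torch.relu(self.lin(sum(inputs)))

ops = nn.ModuleList(NodeOp() for _ in order)

def forward(x):
    outs = {}
    for v in order:
        preds = [outs[src] for src, dst in edges if dst == v]
        outs[v] = ops[v](preds if preds else [x])   # source nodes read the input
    return outs[order[-1]]

y = forward(torch.randn(2, 16))
```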

Neural Architecture Search with Reinforcement Learning

This paper uses a recurrent network to generate the model descriptions of neural networks and trains this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set.

Deep Expander Networks: Efficient Deep Networks from Graph Theory

This work proposes to model connections between filters of a CNN using graphs that are simultaneously sparse and well connected, drawing on a well-studied class of graphs from theoretical computer science, known as expander graphs, that satisfies these properties.
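
One simple way to realize such sparse yet well-connected filter connectivity, purely as an illustration of ours (random regular bipartite graphs are expanders with high probability; the paper's exact construction may differ):

```python
# Hypothetical sketch: each output filter connects to a fixed small number of
# randomly chosen input channels, giving a sparse but well-connected mask.
import torch

def expander_mask(out_ch, in_ch, degree):
    mask = torch.zeros(out_ch, in_ch)
    for o in range(out_ch):
        mask[o, torch.randperm(in_ch)[:degree]] = 1.0
    return mask

mask = expander_mask(64, 64, degree=8)                 # 8 of 64 inputs per filter
sparse_conv_weight = torch.randn(64, 64, 3, 3) * mask[..., None, None]
```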

Learning Implicitly Recurrent CNNs Through Parameter Sharing

A parameter sharing scheme, in which different layers of a convolutional neural network (CNN) are defined by a learned linear combination of parameter tensors from a global bank of templates, which yields a flexible hybridization of traditional CNNs and recurrent networks.
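
A bare-bones sketch of this sharing scheme (shapes, names, and the absence of coefficient normalization are our illustrative choices):

```python
# Each layer's weight tensor is a learned linear combination of a global bank
# of shared templates, so layers that learn similar coefficients behave like a
# recurrent computation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemplateConv(nn.Module):
    def __init__(self, bank):
        super().__init__()
        self.bank = bank                                    # shared (T, out, in, 3, 3)
        self.alpha = nn.Parameter(torch.randn(bank.shape[0]) * 0.1)

    def forward(self, x):
        weight = torch.einsum('t,toihw->oihw', self.alpha, self.bank)
        return F.conv2d(x, weight, padding=1)

bank = nn.Parameter(torch.randn(4, 16, 16, 3, 3) * 0.1)    # global template bank
layer1, layer2 = TemplateConv(bank), TemplateConv(bank)    # same bank, own alphas
y = layer2(layer1(torch.randn(2, 16, 8, 8)))
```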

Learning Sparse Networks Using Targeted Dropout

Targeted dropout is introduced, a method for training a neural network so that it is robust to subsequent pruning, and it improves upon more complicated sparsifying regularisers while being simple to implement and easy to tune.
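
A hedged sketch of the idea as summarized above, with an illustrative targeting fraction and drop rate: mark the lowest-magnitude weights as candidates and drop only those, so the network learns not to rely on them.

```python
# Drop only the lowest-magnitude fraction of weights (the pruning candidates),
# each with some probability, during training.
import torch

def targeted_dropout(weight, gamma=0.5, alpha=0.5, training=True):
    if not training:
        return weight
    k = int(gamma * weight.numel())                      # number of candidates
    thresh = weight.abs().flatten().kthvalue(k).values   # magnitude cutoff
    candidates = weight.abs() <= thresh
    drop = candidates & (torch.rand_like(weight) < alpha)
    return weight * (~drop).float()

w = torch.randn(64, 64)
w_train = targeted_dropout(w)       # weights become robust to later pruning
```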

Luck Matters: Understanding Training Dynamics of Deep ReLU Networks

Using a teacher-student setting, a novel relationship between the gradient received by hidden student nodes and the activations of teacher nodes for deep ReLU networks is discovered and it is proved that student nodes whose weights are initialized to be close to teacher nodes converge to them at a faster rate.

Learning Multiple Layers of Features from Tiny Images

It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.

The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

This work finds that dense, randomly-initialized, feed-forward networks contain subnetworks ("winning tickets") that - when trained in isolation - reach test accuracy comparable to the original network in a similar number of iterations, and articulate the "lottery ticket hypothesis".
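
A compact sketch of the iterative magnitude-pruning-with-rewind loop behind this procedure (our simplification: `train` stands in for a full training run, and the pruning fraction is illustrative):

```python
# Train, prune the smallest-magnitude surviving weights, rewind the remaining
# weights to their original initialization, and repeat.
import copy
import torch

def find_winning_ticket(model, train, rounds=3, prune_frac=0.2):
    init_state = copy.deepcopy(model.state_dict())        # remember the init
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()}

    for _ in range(rounds):
        train(model, masks)                                # train with masks applied
        for n, p in model.named_parameters():
            alive = p[masks[n].bool()].abs()
            cutoff = alive.quantile(prune_frac)            # prune smallest fraction
            masks[n] *= (p.abs() > cutoff).float()
        model.load_state_dict(init_state)                  # rewind to the init
    return masks                                           # the "winning ticket"
```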

Learning Sparse Neural Networks through L0 Regularization

A practical method for L_0 norm regularization for neural networks: pruning the network during training by encouraging weights to become exactly zero, which allows for straightforward and efficient learning of model structures with stochastic gradient descent and allows for conditional computation in a principled way.
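
A sketch of the kind of stochastic gate commonly used for this, following the usual "hard concrete" stretch constants; treat the details as an approximation rather than the paper's definitive formulation.

```python
# A stochastic gate that is exactly zero with non-zero probability yet gives
# gradients to its parameter log_alpha; multiplying weights by such gates and
# penalizing the probability of being non-zero approximates an L0 penalty.
import math
import torch

BETA, GAMMA, ZETA = 2.0 / 3.0, -0.1, 1.1           # common stretch constants

def hard_concrete_gate(log_alpha):
    u = torch.rand_like(log_alpha)
    s = torch.sigmoid((u.log() - (1 - u).log() + log_alpha) / BETA)
    return (s * (ZETA - GAMMA) + GAMMA).clamp(0.0, 1.0)

log_alpha = torch.zeros(128, requires_grad=True)    # one gate per weight or unit
z = hard_concrete_gate(log_alpha)                   # multiply weights by z
l0_penalty = torch.sigmoid(log_alpha - BETA * math.log(-GAMMA / ZETA)).sum()
```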

Sparse Networks from Scratch: Faster Training without Losing Performance

This work develops sparse momentum, an algorithm which uses exponentially smoothed gradients (momentum) to identify layers and weights which reduce the error efficiently and shows that the benefits of momentum redistribution and growth increase with the depth and size of the network.
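
A simplified single-layer sketch of a momentum-driven prune-and-regrow step (names are ours; the full algorithm also redistributes parameters across layers, which is omitted here):

```python
# Prune the smallest-magnitude active weights, then regrow the same number of
# connections where the smoothed gradient (momentum) magnitude is largest.
import torch

def prune_and_regrow(weight, mask, momentum, frac=0.2):
    active = mask.bool()
    k = int(frac * active.sum())
    if k == 0:
        return mask
    # Prune: remove the k smallest-magnitude active weights.
    w_act = weight.abs().masked_fill(~active, float('inf'))
    prune_idx = w_act.flatten().topk(k, largest=False).indices
    mask.view(-1)[prune_idx] = 0.0
    # Regrow: enable the k inactive positions with the largest momentum.
    m_inact = momentum.abs().masked_fill(mask.bool(), float('-inf'))
    grow_idx = m_inact.flatten().topk(k).indices
    mask.view(-1)[grow_idx] = 1.0
    weight.view(-1)[grow_idx] = 0.0            # regrown weights start at zero
    return mask

w = torch.randn(64, 64)
m = (torch.rand(64, 64) < 0.1).float()         # start roughly 10% dense
mom = torch.randn(64, 64)                      # e.g. the optimizer's momentum buffer
m = prune_and_regrow(w, m, mom)
```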