Corpus ID: 208637440

The Search for Sparse, Robust Neural Networks

@article{Cosentino2019TheSF,
  title={The Search for Sparse, Robust Neural Networks},
  author={Justin Cosentino and Federico Zaiter and Dan Pei and Jun Zhu},
  journal={ArXiv},
  year={2019},
  volume={abs/1912.02386}
}
Recent work on deep neural network pruning has shown there exist sparse subnetworks that achieve equal or improved accuracy, training time, and loss using fewer network parameters when compared to their dense counterparts. Orthogonal to pruning literature, deep neural networks are known to be susceptible to adversarial examples, which may pose risks in security- or safety-critical applications. Intuition suggests that there is an inherent trade-off between sparsity and robustness such that… 

Achieving adversarial robustness via sparsity

TLDR
This work theoretically proves that the sparsity of network weights is closely associated with model robustness, and proposes a novel adversarial training method called inverse weights inheritance, which imposes a sparse weight distribution on a large network by inheriting weights from a small network, thereby improving the robustness of the large network.

Finding Dynamics Preserving Adversarial Winning Tickets

TLDR
This work systematically studies the dynamics of adversarial training and proves the existence of a trainable sparse sub-network at initialization that can be trained to be adversarially robust from scratch; such a sub-network structure is referred to as an Adversarial Winning Ticket (AWT).

Proving the Strong Lottery Ticket Hypothesis for Convolutional Neural Networks

TLDR
This work shows that, with high probability, it is possible to approximate any CNN by pruning a random CNN whose size is larger by a logarithmic factor.

Pruning via Iterative Ranking of Sensitivity Statistics

TLDR
‘SNIP-it’ is introduced, and it is demonstrated how it can be applied for both structured and unstructured pruning, before and/or during training, thereby achieving state-of-the-art sparsity-performance trade-offs.

Speeding-up pruning for Artificial Neural Networks: Introducing Accelerated Iterative Magnitude Pruning

TLDR
It is shown that, in a limited setting, when targeting high overall sparsity rates, the training time of each pruning iteration (save for the last one) can be effectively reduced while yielding a final product whose performance is comparable to the ANN obtained using the existing method.

Spatio-Temporal Sparsification for General Robust Graph Convolution Networks

TLDR
This work proposes to defend against adversarial attacks on GNNs by applying spatio-temporal sparsification (called ST-Sparse) to the GNN hidden node representations, similar in spirit to Dropout regularization.

Not All Parameters Should Be Treated Equally: Deep Safe Semi-supervised Learning under Class Distribution Mismatch

TLDR
Safe Parameter Learning (SPL) is proposed to discover safe parameters and make the harmful parameters inactive, such that it can mitigate the adverse effects caused by unseen-class data.

Optimal Lottery Tickets via SubsetSum: Logarithmic Over-Parameterization is Sufficient

TLDR
This work closes the gap and offers an exponential improvement to the over-parameterization requirement for the existence of lottery tickets, showing that any target network of width d and depth l can be approximated by pruning a random network that is a factor of O(log(dl)) wider and twice as deep.

Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning

TLDR
This paper demonstrates for the first time that such extremely compact and independently trainable sub-networks can also be identified in the lifelong learning scenario, and introduces lottery teaching, which further overcomes forgetting via knowledge distillation aided by external unlabeled data.

References

Showing 1-10 of 25 references

Training for Faster Adversarial Robustness Verification via Inducing ReLU Stability

TLDR
It is demonstrated that improving weight sparsity alone already enables us to turn computationally intractable verification problems into tractable ones and improving ReLU stability leads to an additional 4-13x speedup in verification times.

Towards Evaluating the Robustness of Neural Networks

TLDR
It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that succeed on both distilled and undistilled neural networks with 100% probability.

Towards Deep Learning Models Resistant to Adversarial Attacks

TLDR
This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.

Explaining and Harnessing Adversarial Examples

TLDR
It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.

Rethinking the Value of Network Pruning

TLDR
It is found that, with an optimal learning rate, the "winning ticket" initialization as used in Frankle & Carbin (2019) does not bring improvement over random initialization, suggesting the need for more careful baseline evaluations in future research on structured pruning methods.

Intriguing properties of neural networks

TLDR
It is found that there is no distinction between individual high-level units and random linear combinations of high-level units according to various methods of unit analysis, suggesting that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks.

The Limitations of Deep Learning in Adversarial Settings

TLDR
This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.

Learning both Weights and Connections for Efficient Neural Network

TLDR
A method is proposed to reduce the storage and computation required by neural networks by an order of magnitude without affecting their accuracy, by learning only the important connections and pruning redundant connections using a three-step method.
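The prune step of the train-prune-retrain pipeline summarized above can be sketched as magnitude-based thresholding; `magnitude_prune` below is a hypothetical helper for illustration, not code from the paper:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    Hypothetical helper illustrating the prune step of a
    train -> prune -> retrain pipeline.
    """
    flat = np.abs(weights).flatten()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy(), np.ones_like(weights, dtype=bool)
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

# toy layer: prune 50% of a 4x4 weight matrix
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
pruned, mask = magnitude_prune(w, 0.5)
# during retraining, gradients would be masked so pruned
# connections stay at zero: grad *= mask
```

Retraining then updates only the surviving connections, which is what lets the dense network recover accuracy at high sparsity.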

A Survey of Model Compression and Acceleration for Deep Neural Networks

TLDR
This paper surveys recent advanced techniques for compacting and accelerating CNN models, roughly categorized into four schemes: parameter pruning and sharing, low-rank factorization, transferred/compact convolutional filters, and knowledge distillation.

Stabilizing the Lottery Ticket Hypothesis

TLDR
This paper modifies IMP to search for subnetworks that could have been obtained by pruning early in training rather than at iteration 0, and studies subnetwork "stability," finding that, as accuracy improves in this fashion, IMP subnetworks train to parameters closer to those of the full network and do so with improved consistency in the face of gradient noise.
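The iterative procedure described above (train, prune a fraction of surviving weights, rewind the rest to early-training values, repeat) can be sketched schematically; `train_fn` and the toy identity training step are assumptions for illustration, not the authors' code:

```python
import numpy as np

def imp_with_rewinding(init_weights, train_fn, rounds=3,
                       prune_frac=0.2, rewind_weights=None):
    """Schematic of iterative magnitude pruning (IMP) with rewinding.

    `train_fn(weights, mask)` is an assumed user-supplied routine that
    returns trained weights. Rewinding to early-training weights
    (rather than to iteration 0) follows the stabilization idea
    described above.
    """
    mask = np.ones_like(init_weights, dtype=bool)
    if rewind_weights is None:
        rewind_weights = init_weights  # classic LTH: rewind to init
    weights = init_weights
    for _ in range(rounds):
        trained = train_fn(weights, mask)
        # prune the smallest-magnitude surviving weights
        surviving = np.abs(trained[mask])
        k = int(prune_frac * surviving.size)
        if k > 0:
            threshold = np.partition(surviving, k - 1)[k - 1]
            mask = mask & (np.abs(trained) > threshold)
        # rewind surviving weights to their (early-)training values
        weights = rewind_weights * mask
    return weights, mask

# toy demonstration with a dummy identity training step (assumption)
rng = np.random.default_rng(1)
w0 = rng.normal(size=16)
final_w, final_mask = imp_with_rewinding(w0, lambda wt, m: wt * m)
```

Passing a snapshot of the weights from a few hundred steps into training as `rewind_weights` gives the "early in training" variant studied in the paper.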