Corpus ID: 235458120

Scaling-up Diverse Orthogonal Convolutional Networks with a Paraunitary Framework

@article{Su2021ScalingupDO,
  title={Scaling-up Diverse Orthogonal Convolutional Networks with a Paraunitary Framework},
  author={Jiahao Su and Wonmin Byeon and Furong Huang},
  journal={ArXiv},
  year={2021},
  volume={abs/2106.09121}
}
Enforcing orthogonality in neural networks is an antidote to gradient vanishing/exploding problems and to sensitivity to adversarial perturbations, and it helps bound generalization errors. However, many previous approaches are heuristic, and the orthogonality of convolutional layers has not been systematically studied: some of these designs are not exactly orthogonal, while others only consider standard convolutional layers and propose specific classes of their realizations. To address this problem, we propose…
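
As a concrete illustration of the norm-preservation property that makes orthogonality useful, here is a minimal sketch using PyTorch's generic orthogonal parametrization of a square fully-connected layer; it is not the paper's paraunitary construction for convolutional layers.

```python
# Minimal sketch (assumption: PyTorch >= 1.10 for torch.nn.utils.parametrizations).
# Generic orthogonal linear layer, not the paper's paraunitary method.
import torch
import torch.nn as nn
from torch.nn.utils import parametrizations

layer = nn.Linear(64, 64, bias=False)
parametrizations.orthogonal(layer, "weight")   # W is kept orthogonal during training

x = torch.randn(8, 64)
y = layer(x)

# An orthogonal W is an isometry: ||Wx|| == ||x|| for every x, which is why
# gradients can neither vanish nor explode through such a layer.
print(torch.allclose(x.norm(dim=1), y.norm(dim=1), atol=1e-4))  # True
```
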
1 Citation

Certified Defense via Latent Space Randomized Smoothing with Orthogonal Encoders
  • Huimin Zeng, Jiahao Su, Furong Huang
  • Computer Science
  • ArXiv
  • 2021
TLDR
This work investigates the possibility of performing randomized smoothing and establishing the robust certification in the latent space of a network, so that the overall dimensionality of tensors involved in computation could be drastically reduced.

References

SHOWING 1-10 OF 50 REFERENCES
Orthogonal Convolutional Neural Networks
TLDR
The proposed orthogonal convolution requires no additional parameters and little computational overhead, and consistently outperforms the kernel orthogonality alternative on a wide range of tasks such as image classification and inpainting under supervised, semi-supervised and unsupervised settings.
Preventing Gradient Attenuation in Lipschitz Constrained Convolutional Networks
TLDR
The Block Convolution Orthogonal Parameterization (BCOP), an expressive parameterization of orthogonal convolution operations, is presented and found to be competitive with existing approaches to provable adversarial robustness and Wasserstein distance estimation.
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
TLDR
This work demonstrates that it is possible to train vanilla CNNs with ten thousand layers or more simply by using an appropriate initialization scheme, and presents an algorithm for generating such random initial orthogonal convolution kernels.
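
A hedged sketch of a delta-orthogonal-style initialization in the spirit of this TLDR (an orthogonal matrix at the kernel's centre tap, zeros elsewhere); the exact sampling procedure in the paper may differ.

```python
# Delta-orthogonal style init sketch (assumption: out_channels >= in_channels);
# not necessarily the authors' exact algorithm.
import torch

def delta_orthogonal_(kernel: torch.Tensor) -> torch.Tensor:
    out_c, in_c, kh, kw = kernel.shape
    with torch.no_grad():
        kernel.zero_()                        # all spatial taps zero ...
        q = torch.empty(out_c, in_c)
        torch.nn.init.orthogonal_(q)          # ... except an orthogonal centre tap
        kernel[:, :, kh // 2, kw // 2] = q
    return kernel

# Usage: delta_orthogonal_(torch.nn.Conv2d(64, 64, kernel_size=3, padding=1).weight)
```
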
Parseval Networks: Improving Robustness to Adversarial Examples
TLDR
It is shown that Parseval networks match the state-of-the-art in terms of accuracy on CIFAR-10/100 and Street View House Numbers while being more robust than their vanilla counterpart against adversarial examples.
Squeeze-and-Excitation Networks
TLDR
This work proposes a novel architectural unit, termed the “Squeeze-and-Excitation” (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels, and shows that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets.
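
For reference, a minimal Squeeze-and-Excitation block along the lines of this TLDR; the reduction ratio and module layout are illustrative choices, not taken from the authors' code.

```python
# Minimal SE block sketch: global "squeeze" followed by a channel-gating "excitation".
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # squeeze: global spatial context
        self.fc = nn.Sequential(                       # excitation: per-channel gate in (0, 1)
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        gate = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * gate                                # recalibrate channel responses

# Usage: SEBlock(64)(torch.randn(2, 64, 32, 32))
```
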
Scaling provable adversarial defenses
TLDR
This paper presents a technique for extending these training procedures to much more general networks, with skip connections and general nonlinearities, and shows how to further improve robust error through cascade models.
Improving Training of Deep Neural Networks via Singular Value Bounding
TLDR
This work proposes to constrain the solutions of weight matrices to the orthogonal feasible set during the whole process of network training, and achieves this by a simple yet effective method called Singular Value Bounding (SVB).
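
A hedged sketch of the Singular Value Bounding idea as summarized above: after an optimizer step, project each 2-D weight matrix's singular values back into a narrow band around 1. The band width `eps` and the use of a full SVD here are illustrative assumptions, not the authors' exact procedure.

```python
# SVB-style projection sketch for a 2-D weight matrix (e.g. a Linear layer).
import torch

@torch.no_grad()
def bound_singular_values(weight: torch.Tensor, eps: float = 0.05) -> None:
    u, s, vh = torch.linalg.svd(weight, full_matrices=False)
    s = s.clamp(min=1.0 / (1.0 + eps), max=1.0 + eps)  # keep singular values near 1
    weight.copy_(u @ torch.diag(s) @ vh)

# Usage: periodically call bound_singular_values(layer.weight) during training.
```
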
i-RevNet: Deep Invertible Networks
TLDR
The i-RevNet, a network that can be fully inverted up to the final projection onto the classes (i.e., no information is discarded), is built, and linear interpolations between natural image representations are reconstructed.
Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization
TLDR
An efficient parametrization of the transition matrix of an RNN that allows us to stabilize the gradients that arise in its training and empirically solves the vanishing gradient issue to a large extent.
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
TLDR
A new scaling method is proposed that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient, and its effectiveness is demonstrated on scaling up MobileNets and ResNet.
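
A small sketch of the compound-scaling rule this TLDR describes: one coefficient phi scales depth, width, and resolution together. The constants below are placeholders for illustration, not claimed to be the paper's searched values.

```python
# Compound scaling sketch: depth, width and input resolution grow jointly with phi.
def compound_scale(base_depth: int, base_width: int, base_res: int, phi: float,
                   alpha: float = 1.2, beta: float = 1.1, gamma: float = 1.15):
    return (round(base_depth * alpha ** phi),   # layers
            round(base_width * beta ** phi),    # channels
            round(base_res * gamma ** phi))     # input resolution

# e.g. compound_scale(18, 64, 224, phi=2)
```
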