# Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable

@article{Ghosal2022RandomlyIO, title={Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable}, author={Promit Ghosal and Srinath Mahankali and Yihang Sun}, journal={ArXiv}, year={2022}, volume={abs/2205.11716} }

Recently, neural networks have been shown to perform exceptionally well in transforming two arbitrary sets into two linearly separable sets. Doing this with a randomly initialized neural network is of immense interest because the associated computation is cheaper than using fully trained networks. In this paper, we show that, with sufficient width, a randomly initialized one-layer neural network transforms two sets into two linearly separable sets with high probability. Furthermore, we provide…
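The central claim lends itself to a quick empirical check. The sketch below is a minimal illustration, not the paper's construction: it draws two classes that are not linearly separable in the plane (a disk and a surrounding ring), pushes them through an untrained one-layer ReLU network with i.i.d. Gaussian weights and biases, and tests separability of the resulting features with a perceptron. The width k = 500, the Gaussian bias distribution, and the ring-shaped data are assumptions made here for illustration, not parameters taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes that are NOT linearly separable in the input space:
# points inside a small disk vs. points on a surrounding annulus.
n, d = 200, 2
radii_inner = rng.uniform(0.0, 1.0, n)
radii_outer = rng.uniform(2.0, 3.0, n)
angles = rng.uniform(0.0, 2 * np.pi, 2 * n)
X = np.concatenate([radii_inner, radii_outer])[:, None] * np.stack(
    [np.cos(angles), np.sin(angles)], axis=1
)
y = np.concatenate([-np.ones(n), np.ones(n)])

# Randomly initialized one-layer ReLU network (illustrative choices):
# width k, i.i.d. Gaussian weights and biases, no training at all.
k = 500
W = rng.normal(size=(k, d))
b = rng.normal(size=k)
features = np.maximum(W @ X.T + b[:, None], 0.0).T  # shape (2n, k)

def perceptron_separable(F, labels, epochs=1000):
    """Return True if a perceptron finds a separating hyperplane.

    The perceptron makes no mistakes in a full pass iff the data
    are linearly separable (within the given epoch budget).
    """
    w = np.zeros(F.shape[1])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for f, t in zip(F, labels):
            if t * (f @ w + bias) <= 0:
                w += t * f
                bias += t
                mistakes += 1
        if mistakes == 0:
            return True
    return False

print("separable after random ReLU layer:", perceptron_separable(features, y))
```

With enough random ReLU features, the transformed sets are typically separable even though the raw 2D data are not; shrinking k toward the input dimension makes the perceptron check fail, consistent with the paper's emphasis on sufficient width.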

## References


### The Separation Capacity of Random Neural Networks

- Computer Science, ArXiv
- 2021

It is shown that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve the data separation problem, answering the question of under what conditions a random neural network can make two classes $X^-, X^+$ linearly separable.

### On the Power and Limitations of Random Features for Understanding Neural Networks

- Computer Science, NeurIPS
- 2019

This paper rigorously shows that random features cannot be used to learn even a single ReLU neuron with standard Gaussian inputs unless the network size is exponentially large, and concludes that a single neuron is learnable with gradient-based methods.

### Random Vector Functional Link Networks for Function Approximation on Manifolds

- Computer Science, Mathematics, ArXiv
- 2020

A (corrected) rigorous proof is provided that the Igelnik and Pao construction is a universal approximator for continuous functions on compact domains, with approximation error decaying asymptotically like $O(1/\sqrt{n})$ in the number of network nodes $n$.

### Identity Matters in Deep Learning

- Computer Science, ICLR
- 2017

This work gives a strikingly simple proof that arbitrarily deep linear residual networks have no spurious local optima, and shows that residual networks with ReLU activations have universal finite-sample expressivity, in the sense that the network can represent any function of its sample provided the model has more parameters than the sample size.

### ImageNet classification with deep convolutional neural networks

- Computer Science, Commun. ACM
- 2012

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into 1000 different classes; the model employed a recently developed regularization method called "dropout" that proved to be very effective.

### Understanding deep learning requires rethinking generalization

- Computer Science, ICLR
- 2017

These experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data, and they confirm that simple depth-two neural networks already have perfect finite-sample expressivity.

### Upper bounds on the number of hidden neurons in feedforward networks with arbitrary bounded nonlinear activation functions

- Computer Science, IEEE Trans. Neural Networks
- 1998

This paper rigorously proves that standard single-hidden-layer feedforward networks with at most N hidden neurons, and with any bounded nonlinear activation function that has a limit at one infinity, can learn N distinct samples with zero error.

### Learning capability and storage capacity of two-hidden-layer feedforward networks

- Computer Science, IEEE Trans. Neural Networks
- 2003

This paper rigorously proves, by a constructive method, that two-hidden-layer feedforward networks (TLFNs) with $2\sqrt{(m+2)N}$ ($\ll N$) hidden neurons can learn any N distinct samples with arbitrarily small error, where m is the required number of output neurons.