Spurious Local Minima are Common in Two-Layer ReLU Neural Networks

@inproceedings{Safran2018SpuriousLM,
  title={Spurious Local Minima are Common in Two-Layer ReLU Neural Networks},
  author={Itay Safran and Ohad Shamir},
  booktitle={ICML},
  year={2018}
}
We consider the optimization problem associated with training simple ReLU neural networks of the form $x \mapsto \sum_{i=1}^{k} \max\{0, w_i^\top x\}$ with respect to the squared loss. We provide a computer-assisted proof that even if the input distribution is standard Gaussian, even if the dimension is arbitrarily large, and even if the target values are generated by such a network, with orthonormal parameter vectors, the problem can still have spurious local minima once $6 \le k \le 20$. By a concentration of measure…
Highly Cited
This paper has 21 citations.
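The network family and loss described in the abstract can be sketched in a few lines. Below is a minimal, hypothetical NumPy illustration (not the paper's computer-assisted proof): a teacher network of the form $x \mapsto \sum_{i=1}^{k} \max\{0, w_i^\top x\}$ with orthonormal weight vectors (here taken as standard basis vectors), standard Gaussian inputs, and the squared loss of a student network against the teacher's targets.

```python
import numpy as np

def relu_net(W, x):
    """Two-layer ReLU network: x -> sum_i max(0, w_i . x)."""
    return np.maximum(0.0, W @ x).sum()

def squared_loss(W_student, W_target, X):
    """Empirical squared loss of the student against teacher-generated targets."""
    preds = np.array([relu_net(W_student, x) for x in X])
    targets = np.array([relu_net(W_target, x) for x in X])
    return np.mean((preds - targets) ** 2)

rng = np.random.default_rng(0)
k, d = 6, 10                        # k neurons in dimension d (paper studies 6 <= k <= 20)
W_target = np.eye(d)[:k]            # orthonormal teacher weight vectors
X = rng.standard_normal((1000, d))  # inputs drawn from a standard Gaussian

# The teacher itself achieves zero loss; the paper shows other local minima
# of this objective can nonetheless be spurious (strictly positive loss).
print(squared_loss(W_target, W_target, X))
```

The objective is realizable by construction (the global minimum value is zero), which is what makes the existence of spurious local minima in this setting notable.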