Corpus ID: 52903499

Efficiently testing local optimality and escaping saddles for ReLU networks

@article{Yun2019EfficientlyTL,
  title={Efficiently testing local optimality and escaping saddles for ReLU networks},
  author={Chulhee Yun and Suvrit Sra and Ali Jadbabaie},
  journal={ArXiv},
  year={2019},
  volume={abs/1809.10858}
}
  • Chulhee Yun, Suvrit Sra, Ali Jadbabaie
  • Published 2019
  • Mathematics, Computer Science
  • ArXiv
  • We provide a theoretical algorithm for checking local optimality and escaping saddles at nondifferentiable points of empirical risks of two-layer ReLU networks. Our algorithm receives any parameter value and returns: local minimum, second-order stationary point, or a strict descent direction. The presence of $M$ data points on the nondifferentiability of the ReLU divides the parameter space into at most $2^M$ regions, which makes analysis difficult. By exploiting polyhedral geometry, we reduce…
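To make the $2^M$ region count in the abstract concrete, here is a minimal sketch, assuming a two-layer network of the form $v^\top \mathrm{ReLU}(Wx+b)$; the arrays `W`, `b`, `X` and the helper `boundary_point_indices` are illustrative assumptions, not the paper's algorithm. It flags the data points whose pre-activation lands exactly on a ReLU kink; each such point can be assigned to either side of its hyperplane, which is what yields up to $2^M$ locally smooth regions that the paper's polyhedral-geometry approach avoids enumerating one by one.

```python
# Minimal sketch (not the paper's algorithm): count the data points that sit
# exactly on a ReLU kink for a two-layer network x -> v^T relu(W x + b).
# W, b, X below are hypothetical example arrays, not from the paper.
import numpy as np

def boundary_point_indices(W, b, X, tol=1e-9):
    """Indices i such that some hidden unit j has W[j] @ X[i] + b[j] == 0.

    At these points the empirical risk is nondifferentiable: each such
    boundary point can be assigned activation "on" or "off", so M of them
    split a neighborhood of (W, b) into at most 2**M regions on which the
    loss is smooth.
    """
    pre = X @ W.T + b                      # (n, h) pre-activations
    on_kink = np.isclose(pre, 0.0, atol=tol)
    return np.flatnonzero(on_kink.any(axis=1))

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))            # 4 hidden units, 3 input dims
b = rng.standard_normal(4)
X = rng.standard_normal((20, 3))           # 20 data points
X[5] = -b[0] * W[0] / np.dot(W[0], W[0])   # place x_5 exactly on unit 0's kink

idx = boundary_point_indices(W, b, X)
M = len(idx)
print(f"boundary points: {idx.tolist()}; at most 2**{M} = {2**M} regions")
```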
    7 Citations
    • Gradient Descent Provably Optimizes Over-parameterized Neural Networks
    • Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization
    • Are deep ResNets provably better than linear predictors?
    • Nonlinearities in activations substantially shape the loss surfaces of neural networks
    • Piecewise linear activations substantially shape the loss surfaces of neural networks
    • Distribution System State Estimation Via Data-Driven and Physics-Aware Deep Neural Networks
    • Huang Zhi-xiong, Shi Zhuo, et al. National Cultural Symbols Recognition Based on Convolutional Neural Network. 2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), 2020.
