Nonconvex regularization for sparse neural networks

@article{Pieper2020NonconvexRF,
  title={Nonconvex regularization for sparse neural networks},
  author={Konstantin Pieper and Armenak Petrosyan},
  journal={Applied and Computational Harmonic Analysis},
  year={2020}
}
  • K. Pieper, A. Petrosyan
  • Published 24 April 2020
  • Computer Science, Mathematics
  • Applied and Computational Harmonic Analysis
1 Citation

First-Order Methods for Nonsmooth Nonconvex Functional Constrained Optimization with or without Slater Points

A simple method that still converges stably on piecewise quadratic SCAD-regularized problems despite frequent violations of constraint qualification; the non-Lipschitz analysis of the switching subgradient method appears to be new and may be of independent interest.
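For context, a brief reminder of the penalty referenced above. This is the standard SCAD definition of Fan and Li, not a formula taken from the citing paper; a > 2 is the usual shape parameter (commonly a = 3.7) and λ > 0 the regularization level:

```latex
% Standard SCAD penalty: piecewise quadratic in |\theta| and constant beyond a\lambda.
\[
p_\lambda(\theta) =
\begin{cases}
  \lambda\,|\theta|, & |\theta| \le \lambda,\\[2pt]
  \dfrac{2a\lambda|\theta| - \theta^2 - \lambda^2}{2(a-1)}, & \lambda < |\theta| \le a\lambda,\\[4pt]
  \dfrac{(a+1)\lambda^2}{2}, & |\theta| > a\lambda.
\end{cases}
\]
```

Because the penalty is linear, then quadratic, then constant in |θ|, the regularized problems it induces are piecewise quadratic, which is the structure exploited above.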

References

SHOWING 1-10 OF 40 REFERENCES

Breaking the Curse of Dimensionality with Convex Neural Networks

  • F. Bach
  • Computer Science
    J. Mach. Learn. Res.
  • 2017
This work considers neural networks with a single hidden layer and non-decreasing homogeneous activation functions like the rectified linear units and shows that they are adaptive to unknown underlying linear structures, such as the dependence on the projection of the input variables onto a low-dimensional subspace.
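As a rough sketch of the setting behind this and several of the following references (our notation, not the paper's): an infinite-width single-hidden-layer network can be written as an integral against a signed measure on the hidden-weight space, and the convex training problem penalizes the total variation norm of that measure:

```latex
% Infinite-width single-hidden-layer network parameterized by a signed measure \mu;
% convex training problem with total-variation (TV) regularization.
\[
f_\mu(x) = \int \sigma(w^\top x + b)\,\mathrm{d}\mu(w,b),
\qquad
\min_{\mu}\;\sum_{i=1}^{N} \ell\big(y_i, f_\mu(x_i)\big) + \lambda\,\|\mu\|_{\mathrm{TV}}.
\]
```

Finite networks correspond to discrete measures μ = Σ_k c_k δ_(w_k, b_k), in which case the TV norm reduces to Σ_k |c_k|.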

SparseNet: Coordinate Descent With Nonconvex Penalties

The properties of penalties suitable for this approach are characterized, their corresponding threshold functions are studied, and a degrees-of-freedom standardizing reparametrization is described that assists the pathwise algorithm.
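As an illustration of the kind of threshold function that drives such a coordinate-descent scheme, here is a minimal Python sketch of the MC+ (minimax concave) thresholding operator for a single standardized coordinate; the function name and the unit-norm-column assumption are ours, not details from the paper:

```python
import numpy as np

def mcplus_threshold(beta_tilde, lam, gamma):
    """Univariate MC+ threshold for a standardized coordinate (gamma > 1).

    beta_tilde : unpenalized least-squares update for the coordinate
    lam        : regularization level lambda
    gamma      : concavity parameter; gamma -> inf recovers soft thresholding
                 (lasso), gamma -> 1 approaches hard thresholding
    """
    abs_b = np.abs(beta_tilde)
    if abs_b <= gamma * lam:
        # Shrink, then rescale to partially undo the shrinkage bias.
        return np.sign(beta_tilde) * max(abs_b - lam, 0.0) / (1.0 - 1.0 / gamma)
    # Beyond gamma * lambda the penalty is flat, so the coordinate is left unshrunk.
    return beta_tilde
```

For example, mcplus_threshold(2.0, lam=1.0, gamma=3.0) returns 1.5, whereas plain soft thresholding would return 1.0, illustrating the reduced bias of the nonconvex penalty on large coefficients.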

Approximation Hardness for A Class of Sparse Optimization Problems

It is proved that finding an O(n^{c1} d^{c2})-optimal solution to an n × d problem is strongly NP-hard for any c1, c2 ∈ [0, 1) such that c1 + c2 < 1.

Analysis of Multi-stage Convex Relaxation for Sparse Regularization

  • Tong Zhang
  • Computer Science
    J. Mach. Learn. Res.
  • 2010
A multi-stage convex relaxation scheme is presented for solving problems with non-convex objective functions arising from sparse regularization, and it is shown that the local solution obtained by this procedure is superior to the global solution of the standard L1 convex relaxation for learning sparse targets.
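The core loop of such a multi-stage scheme can be sketched as iteratively reweighted L1: each stage solves a weighted lasso whose weights come from the (sub)derivative of the nonconvex penalty at the previous stage's solution. Below is a minimal sketch; the capped-L1 penalty choice, the helper name, and the use of scikit-learn's Lasso via column rescaling are our assumptions, not details from the paper:

```python
import numpy as np
from sklearn.linear_model import Lasso

def multistage_capped_l1(X, y, lam, theta, n_stages=5, eps=1e-3):
    """Multi-stage convex relaxation sketch via iteratively reweighted L1.

    Each stage solves a weighted lasso; the weight of coordinate j is the
    subderivative of the capped-L1 penalty lam * min(|b_j|, theta) at the
    previous iterate (~1 inside the cap, ~0 beyond it), so large coefficients
    are progressively un-shrunk across stages.
    """
    n, d = X.shape
    weights = np.ones(d)          # first stage is a plain lasso
    beta = np.zeros(d)
    for _ in range(n_stages):
        w = np.maximum(weights, eps)                   # small floor keeps rescaling well conditioned
        X_scaled = X / w                               # column rescaling turns the weighted
        model = Lasso(alpha=lam, fit_intercept=False)  # L1 penalty into a plain lasso
        model.fit(X_scaled, y)
        beta = model.coef_ / w                         # map back to the original scale
        weights = (np.abs(beta) < theta).astype(float) # capped-L1 subgradient weights
    return beta
```

The rescaling trick works because a weighted L1 penalty Σ_j w_j |β_j| on the original coefficients is a plain L1 penalty on the rescaled coefficients w_j β_j once each column of X is divided by its weight.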

The Role of Neural Network Activation Functions

It is shown how neural network training problems are related to infinite-dimensional optimizations posed over Banach spaces of functions whose solutions are well-known to be fractional and polynomial splines, where the particular Banach space (which controls the order of the spline) depends on the choice of activation function.

Banach Space Representer Theorems for Neural Networks and Ridge Splines

A variational framework is developed to understand the properties of the functions learned by neural networks fit to data, and a representer theorem is derived showing that finite-width, single-hidden-layer neural networks are solutions to inverse problems with total-variation-like regularization.

OPTIMAL COMPUTATIONAL AND STATISTICAL RATES OF CONVERGENCE FOR SPARSE NONCONVEX LEARNING PROBLEMS.

These results show that the final estimator attains an oracle statistical property due to the use of the nonconvex penalty, and improve upon existing results by providing a more refined sample complexity bound as well as an exact support recovery guarantee for the final estimator.

How do infinite width bounded norm networks look in function space?

This work considers the question of which functions can be captured by ReLU networks with an unbounded number of units but a bounded overall Euclidean norm of the weights; equivalently, what is the minimal norm required to approximate a given function.
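For the univariate case, the answer this question leads to is often quoted in the following form (stated here from memory as a reminder, with the unregularized bias and skip terms written explicitly and f'' understood distributionally):

```latex
% Minimal overall Euclidean weight norm of an unbounded-width, single-hidden-layer
% ReLU network that represents a univariate function f exactly.
\[
\min\Big\{ \tfrac{1}{2}\textstyle\sum_k \big(a_k^2 + w_k^2\big) \;:\;
           \textstyle\sum_k a_k\,[w_k x + b_k]_+ + c\,x + c_0 = f(x) \ \text{for all } x \Big\}
= \max\!\Big( \int_{\mathbb{R}} |f''(x)|\,\mathrm{d}x,\; \big|f'(-\infty) + f'(+\infty)\big| \Big).
\]
```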