## One Citation

### First-Order Methods for Nonsmooth Nonconvex Functional Constrained Optimization with or without Slater Points

- Mathematics
- 2022

A simple method is shown to converge stably on piecewise quadratic SCAD regularized problems despite frequent violations of constraint qualification; the non-Lipschitz analysis of the switching subgradient method appears to be new and may be of independent interest.

## References


### Breaking the Curse of Dimensionality with Convex Neural Networks

- Computer Science
- J. Mach. Learn. Res.
- 2017

This work considers neural networks with a single hidden layer and non-decreasing homogeneous activation functions like the rectified linear units and shows that they are adaptive to unknown underlying linear structures, such as the dependence on the projection of the input variables onto a low-dimensional subspace.

### Transformed ℓ1 Regularization for Learning Sparse Deep Neural Networks

- Computer Science
- Neural Networks
- 2019

### SparseNet: Coordinate Descent With Nonconvex Penalties

- Computer Science
- Journal of the American Statistical Association
- 2011

The properties of penalties suitable for this approach are characterized, their corresponding threshold functions are studied, and a df-standardizing reparametrization is described that assists the pathwise algorithm.

### Approximation Hardness for A Class of Sparse Optimization Problems

- Computer Science
- J. Mach. Learn. Res.
- 2019

It is proved that finding an $O(n^{c_1} d^{c_2})$-optimal solution to an $n \times d$ problem is strongly NP-hard for any $c_1, c_2 \in [0, 1)$ such that $c_1 + c_2 < 1$.

### Neural Network with Unbounded Activations is Universal Approximator

- Computer Science, Mathematics
- ArXiv
- 2015

### Analysis of Multi-stage Convex Relaxation for Sparse Regularization

- Computer Science
- J. Mach. Learn. Res.
- 2010

A multi-stage convex relaxation scheme for solving problems with non-convex objective functions with sparse regularization is presented, and it is shown that the local solution obtained by this procedure is superior to the global solution of the standard L1 convex relaxation for learning sparse targets.

### The Role of Neural Network Activation Functions

- Computer Science
- IEEE Signal Processing Letters
- 2020

It is shown how neural network training problems are related to infinite-dimensional optimizations posed over Banach spaces of functions whose solutions are well-known to be fractional and polynomial splines, where the particular Banach space (which controls the order of the spline) depends on the choice of activation function.

### Banach Space Representer Theorems for Neural Networks and Ridge Splines

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2021

A variational framework to understand the properties of the functions learned by neural networks fit to data is developed, and a representer theorem is derived showing that finite-width, single-hidden-layer neural networks are solutions to inverse problems with total variation-like regularization.

### Optimal Computational and Statistical Rates of Convergence for Sparse Nonconvex Learning Problems

- Computer Science, Mathematics
- Annals of Statistics
- 2014

These results show that the final estimator attains an oracle statistical property due to the usage of nonconvex penalty, and improves upon existing results by providing a more refined sample complexity bound as well as an exact support recovery result for the final estimation.

### How do infinite width bounded norm networks look in function space?

- Computer Science, Mathematics
- COLT
- 2019

This work considers which functions can be captured by ReLU networks with an unbounded number of units but bounded overall network Euclidean norm or, equivalently, what minimal norm is required to approximate a given function.