It is shown that ReLU networks trained with standard weight decay are equivalent to block $\ell_1$ penalized convex models and certain standard convolutional linear networks are equivalent semi-definite programs which can be simplified to regularized linear models in a polynomial sized discrete Fourier feature space.Expand

IEEE Transactions on Neural Networks and Learningâ€¦

25 October 2017

TLDR

This work jointly train and optimize the parameters of the LSTM architecture and the OC-SVM (or SVDD) algorithm using highly effective gradient and quadratic programming-based training methods and provides extensions of this unsupervised formulation to the semisupervised and fully supervised frameworks.Expand

It is shown that a set of optimal hidden layer weights for a norm regularized DNN training problem can be explicitly found as the extreme points of a convex set and it is proved that each optimal weight matrix is rank-$K$ and aligns with the previous layers via duality.Expand

A convex analytic framework utilizing semi-infinite duality is developed to obtain equivalent convex optimization problems for several two- and three-layer CNN architectures, and it is proved that two-layerCNNs can be globally optimized via an $\ell_2$ norm regularized convex program.Expand

IEEE Transactions on Neural Networks and Learningâ€¦

1 August 2018

TLDR

This work investigates online nonlinear regression and introduces novel regression structures based on the long short term memory (LSTM) networks by directly replacing the LSTM architecture with the GRU architecture, where the superiority of the introduced algorithms with respect to the conventional methods over several different benchmark real life data sets is illustrated.Expand

IEEE Transactions on Neural Networks and Learningâ€¦

24 October 2017

TLDR

This brief investigates online training of long short term memory (LSTM) architectures in a distributed network of nodes, where each node employs an LSTM-based structure for online regression and introduces a highly effective and efficient distributed particle filtering (DPF)-based training algorithm.Expand

This work demonstrates how neural networks implicitly attempt to solve copositive programs via semi-nonnegative matrix factorization, and describes the first algorithms for provably finding the global minimum of the vector output neural network training problem.Expand

It is proved that the equivalent convex problem can be globally optimized by a standard convex optimization solver with a polynomial-time complexity with respect to the number of samples and data dimension when the width of the network is fixed.Expand

A convex analytic framework for ReLU neural networks is developed which elucidates the inner workings of hidden neurons and their function space characteristics and establishes a connection to $\ell_0$-$\ell_1$ equivalence for neural networks analogous to the minimal cardinality solutions in compressed sensing.Expand

A convex analytic framework for ReLU neural networks is developed which elucidates the inner workings of hidden neurons and their function space characteristics and establishes a connection to `0-`1 equivalence for neural networks analogous to the minimal cardinality solutions in compressed sensing.Expand