# Maximal Sparsity with Deep Networks?

    @inproceedings{Xin2016MaximalSW,
      title     = {Maximal Sparsity with Deep Networks?},
      author    = {Bo Xin and Yizhou Wang and Wen Gao and David P. Wipf and Baoyuan Wang},
      booktitle = {NIPS},
      year      = {2016}
    }

The iterations of many sparse estimation algorithms are comprised of a fixed linear filter cascaded with a thresholding nonlinearity, which collectively resemble a typical neural network layer. Consequently, a lengthy sequence of algorithm iterations can be viewed as a deep network with shared, hand-crafted layer weights. It is therefore quite natural to examine the degree to which a learned network model might act as a viable surrogate for traditional sparse estimation in domains where ample…
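The layer analogy can be made concrete with ISTA applied to the lasso: each iteration applies fixed linear weights followed by a soft-threshold "activation". A minimal NumPy sketch (function names and problem sizes are illustrative, not taken from the paper):

```python
import numpy as np

def soft_threshold(z, tau):
    """Elementwise shrinkage: the 'activation function' of the unfolded network."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista_as_layers(y, Phi, lam, n_layers=100):
    """Run ISTA for the lasso; each iteration is one 'layer' x -> soft(S x + B y)."""
    L = np.linalg.norm(Phi, 2) ** 2              # Lipschitz constant of the gradient
    S = np.eye(Phi.shape[1]) - Phi.T @ Phi / L   # shared 'recurrent' weight
    B = Phi.T / L                                # shared 'input' weight
    x = np.zeros(Phi.shape[1])
    for _ in range(n_layers):
        x = soft_threshold(S @ x + B @ y, lam / L)
    return x
```

With S and B derived from Φ as above, the network is exactly hand-crafted ISTA; the paper's question is what happens when these shared weights are learned from data instead.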

#### 129 Citations

Deep Sparse Coding Using Optimized Linear Expansion of Thresholds

- Computer Science, Mathematics
- ArXiv
- 2017

This work addresses the problem of reconstructing sparse signals from noisy and compressive measurements using a feed-forward deep neural network with an architecture motivated by the iterative shrinkage-thresholding algorithm (ISTA), and derives an improved network architecture, inspired by FISTA (a faster variant of ISTA), that achieves similar signal estimation performance.
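The ISTA-to-FISTA step described here amounts to adding a momentum combination of successive iterates, which in the unfolded-network view becomes a skip connection between layers. A hedged sketch (names and sizes are illustrative):

```python
import numpy as np

def soft_threshold(z, tau):
    """Elementwise shrinkage nonlinearity."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def fista(y, Phi, lam, n_iter=200):
    """FISTA: an ISTA step plus a momentum ('lookahead') combination of iterates,
    which an unfolded network mirrors with skip connections between layers."""
    L = np.linalg.norm(Phi, 2) ** 2
    x = z = np.zeros(Phi.shape[1])
    t = 1.0
    for _ in range(n_iter):
        x_new = soft_threshold(z - Phi.T @ (Phi @ z - y) / L, lam / L)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)   # momentum step
        x, t = x_new, t_new
    return x
```

The only change from ISTA is the extrapolation point z; the thresholding layer itself is untouched, which is why the same learned-network template applies.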

From Bayesian Sparsity to Gated Recurrent Nets

- 2017

The iterations of many first-order algorithms, when applied to minimizing common regularized regression functions, often resemble neural network layers with prespecified weights. This observation has…

From Bayesian Sparsity to Gated Recurrent Nets

- Computer Science, Mathematics
- NIPS
- 2017

The parallels between latent variable trajectories operating across multiple time-scales during optimization, and the activations within deep network structures designed to adaptively model such characteristic sequences are examined, leading to a novel sparse estimation system that can estimate optimal solutions efficiently in regimes where other algorithms fail.

From Sparse Bayesian Learning to Deep Recurrent Nets

- 2017

where $y \in \mathbb{R}^n$ is an observed vector, $\Phi \in \mathbb{R}^{n \times m}$ is some known dictionary of basis vectors with $m > n$, $\|\cdot\|_0$ denotes the $\ell_0$ sparsity-promoting norm, and $\lambda$ is a trade-off parameter. Although crucial to many…
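The snippet opens mid-sentence; the objective it refers to was lost in extraction. Given the variables it names, the standard $\ell_0$-regularized regression problem is presumably intended (a reconstruction, not a quote from the paper):

```latex
\min_{x \in \mathbb{R}^m} \; \|y - \Phi x\|_2^2 + \lambda \|x\|_0
```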

Optimal deep neural networks for sparse recovery via Laplace techniques

- Mathematics, Computer Science
- ArXiv
- 2017

It is shown that the centroid can be computed analytically by extending a recent result that facilitates the volume computation of polytopes via Laplace transformations, and may serve as a viable initialization to be further optimized and trained using particular input datasets at hand.

Multilayer Convolutional Sparse Modeling: Pursuit and Dictionary Learning

- Computer Science, Mathematics
- IEEE Transactions on Signal Processing
- 2018

This work represents a bridge between matrix factorization, sparse dictionary learning, and sparse autoencoders. It is shown that training the filters is essential to allow for nontrivial signals in the model, and an online algorithm is proposed to learn the dictionaries from real data, effectively resulting in cascaded sparse convolutional layers.

Understanding Trainable Sparse Coding with Matrix Factorization

- Computer Science, Mathematics
- ICLR
- 2017

The analysis reveals that a specific matrix factorization of the Gram kernel of the dictionary attempts to nearly diagonalise the kernel with a basis that produces a small perturbation of the $\ell_1$ ball, and proves that the resulting splitting algorithm enjoys an improved convergence bound with respect to the non-adaptive version.

The Sparse Recovery Autoencoder

- Computer Science
- ArXiv
- 2018

A new method is presented to learn linear encoders that adapt to data while still performing well with the widely used $\ell_1$ decoder, based on the insight that unfolding the convex decoder into projected gradient steps makes the decoder differentiable, so the encoder can be trained end to end.

Adaptive Acceleration of Sparse Coding via Matrix Factorization

- Mathematics
- 2016

Sparse coding remains a core building block in many data analysis and machine learning pipelines. Typically it is solved by relying on generic optimization techniques that are optimal in the class…

Generalization bounds for deep thresholding networks

- Computer Science, Mathematics
- ArXiv
- 2020

This work considers compressive sensing in the scenario where the sparsity basis (dictionary) is not known in advance, but needs to be learned from examples, and defines deep networks parametrized by the dictionary, which are called deep thresholding networks.

#### References

Showing 1–10 of 45 references

Learning Efficient Sparse and Low Rank Models

- Computer Science, Medicine
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2015

A principled way is shown to construct learnable pursuit architectures for structured sparse and robust low-rank models, derived from the iterations of proximal descent algorithms; the resulting processes learn to approximate the exact parsimonious representation at a fraction of the complexity of the standard optimization methods.

Sparse Estimation with Structured Dictionaries

- Computer Science, Mathematics
- NIPS
- 2011

Sparse penalized regression models are analyzed with the purpose of finding, to the extent possible, regimes of dictionary invariant performance, and a Type II Bayesian estimator with a dictionary-dependent sparsity penalty is shown to have a number of desirable invariance properties leading to provable advantages over more conventional penalties.

Learning Deep ℓ0 Encoders

- Computer Science, Mathematics
- AAAI
- 2016

The proposed Deep ℓ0 Encoders enjoy faster inference, larger learning capacity, and better scalability compared to conventional sparse coding solutions, and under task-driven losses the models can be conveniently optimized from end to end.

Learning Fast Approximations of Sparse Coding

- Computer Science
- ICML
- 2010

Two versions of a very fast algorithm are proposed that produce approximate estimates of the sparse code, which can be used to compute good visual features or to initialize exact iterative algorithms.
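The architecture this reference (LISTA) proposes keeps ISTA's layer structure but treats the filter matrices and thresholds as free parameters. A sketch of the forward pass, with an ISTA-equivalent initialization so the untrained network reproduces truncated ISTA (parameter names are illustrative):

```python
import numpy as np

def soft_threshold(z, tau):
    """Elementwise shrinkage nonlinearity."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def lista_forward(y, W_e, S, theta):
    """Forward pass of a LISTA-style network: shared weights W_e, S and one
    threshold per layer. In LISTA these are trained; here we only show the
    architecture."""
    x = soft_threshold(W_e @ y, theta[0])
    for k in range(1, len(theta)):
        x = soft_threshold(W_e @ y + S @ x, theta[k])
    return x

def ista_init(Phi, lam, n_layers):
    """ISTA-equivalent initialization of the LISTA parameters."""
    L = np.linalg.norm(Phi, 2) ** 2
    W_e = Phi.T / L
    S = np.eye(Phi.shape[1]) - Phi.T @ Phi / L
    theta = np.full(n_layers, lam / L)
    return W_e, S, theta
```

Starting from this initialization, training W_e, S, and theta on example (y, x) pairs is what lets a few learned layers match many hand-crafted iterations.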

Iterative Thresholding for Sparse Approximations

- Mathematics
- 2008

Sparse signal expansions represent or approximate a signal using a small number of elements from a large collection of elementary waveforms. Finding the optimal sparse expansion is known to be NP…
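The hard-thresholding iteration this reference studies projects a gradient step onto the set of s-sparse vectors. A minimal sketch (step size and normalization chosen so the iteration is stable; illustrative, not the paper's exact algorithm):

```python
import numpy as np

def hard_threshold(z, s):
    """Keep the s largest-magnitude entries of z and zero the rest."""
    out = np.zeros_like(z)
    keep = np.argsort(np.abs(z))[-s:]
    out[keep] = z[keep]
    return out

def iht(y, Phi, s, n_iter=500, mu=1.0):
    """Iterative hard thresholding: a gradient step on ||y - Phi x||^2 followed
    by projection onto the s-sparse set; stable when ||Phi||_2 <= 1."""
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        x = hard_threshold(x + mu * Phi.T @ (y - Phi @ x), s)
    return x
```

Unlike the soft threshold, this nonlinearity enforces an exact sparsity level s rather than penalizing the $\ell_1$ norm, which is what connects it to the $\ell_0$ formulation in the main paper.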

Deep Unfolding: Model-Based Inspiration of Novel Deep Architectures

- Computer Science, Mathematics
- ArXiv
- 2014

This work starts with a model-based approach and an associated inference algorithm, and folds the inference iterations into layers of a deep network; it shows how this framework allows conventional networks to be interpreted as mean-field inference in Markov random fields, and obtains new architectures by instead using belief propagation as the inference algorithm.

Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization

- Computer Science, Medicine
- Proceedings of the National Academy of Sciences of the United States of America
- 2003

This article obtains parallel results in a more general setting, where the dictionary D can arise from two or several bases, frames, or even less structured systems, and sketches three applications: separating linear features from planar ones in 3D data, noncooperative multiuser encoding, and identification of over-complete independent component models.

Normalized Iterative Hard Thresholding: Guaranteed Stability and Performance

- Mathematics, Computer Science
- IEEE Journal of Selected Topics in Signal Processing
- 2010

With this modification, empirical evidence suggests that the algorithm is faster than many other state-of-the-art approaches while showing similar performance, and the modified algorithm retains theoretical performance guarantees similar to the original algorithm.

Latent Variable Bayesian Models for Promoting Sparsity

- Computer Science, Mathematics
- IEEE Transactions on Information Theory
- 2011

In coefficient space, the analysis reveals that Type II is exactly equivalent to performing standard MAP estimation using a particular class of dictionary- and noise-dependent, nonfactorial coefficient priors.