Corpus ID: 3145203

# Maximal Sparsity with Deep Networks?

@inproceedings{Xin2016MaximalSW,
  title={Maximal Sparsity with Deep Networks?},
  author={Bo Xin and Yizhou Wang and Wen Gao and David P. Wipf and Baoyuan Wang},
  booktitle={NIPS},
  year={2016}
}
• Bo Xin, Yizhou Wang, Wen Gao, David P. Wipf, Baoyuan Wang
• Published in NIPS, 1 April 2016
• Computer Science, Mathematics
The iterations of many sparse estimation algorithms are comprised of a fixed linear filter cascaded with a thresholding nonlinearity, which collectively resemble a typical neural network layer. Consequently, a lengthy sequence of algorithm iterations can be viewed as a deep network with shared, hand-crafted layer weights. It is therefore quite natural to examine the degree to which a learned network model might act as a viable surrogate for traditional sparse estimation in domains where ample…
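The layer analogy in the abstract can be sketched concretely. The following is a minimal numpy illustration with hypothetical names, not the paper's code: each ISTA iteration applies a fixed affine map followed by soft thresholding, so running K iterations is structurally a K-layer network with shared, hand-crafted weights.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1 norm: the "thresholding nonlinearity".
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_unfolded(Phi, y, lam, n_layers=100):
    """ISTA for min_x 0.5*||y - Phi x||^2 + lam*||x||_1.
    Every iteration is the same affine map plus a nonlinearity,
    i.e. one layer of an unfolded network with hand-crafted weights."""
    L = np.linalg.norm(Phi, 2) ** 2               # Lipschitz constant of the gradient
    W = np.eye(Phi.shape[1]) - (Phi.T @ Phi) / L  # shared layer weight matrix
    b = (Phi.T @ y) / L                           # shared layer bias
    x = np.zeros(Phi.shape[1])
    for _ in range(n_layers):                     # "depth" of the unfolded network
        x = soft_threshold(W @ x + b, lam / L)
    return x
```

Replacing the hand-crafted weights and threshold with parameters learned from data gives LISTA-style networks, which is exactly the kind of learned surrogate the paper examines.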
129 Citations

#### Paper Mentions

Deep Sparse Coding Using Optimized Linear Expansion of Thresholds
• Computer Science, Mathematics
• ArXiv
• 2017
This work addresses the problem of reconstructing sparse signals from noisy and compressive measurements using a feed-forward deep neural network whose architecture is motivated by the iterative shrinkage-thresholding algorithm (ISTA), and derives an improved network architecture inspired by FISTA, a faster variant of ISTA, that achieves similar signal estimation performance.
From Bayesian Sparsity to Gated Recurrent Nets
• 2017
The iterations of many first-order algorithms, when applied to minimizing common regularized regression functions, often resemble neural network layers with prespecified weights. This observation has…
From Bayesian Sparsity to Gated Recurrent Nets
• Computer Science, Mathematics
• NIPS
• 2017
The parallels between latent variable trajectories operating across multiple time-scales during optimization, and the activations within deep network structures designed to adaptively model such characteristic sequences are examined, leading to a novel sparse estimation system that can estimate optimal solutions efficiently in regimes where other algorithms fail.
From Sparse Bayesian Learning to Deep Recurrent Nets
• 2017
where y ∈ R^n is an observed vector, Φ ∈ R^{n×m} is a known dictionary of basis vectors with m > n, ‖·‖_0 denotes the ℓ0 sparsity-promoting norm, and λ is a trade-off parameter. Although crucial to many…
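For readability, the fragment above is quoting the standard ℓ0-penalized regression objective; written out in full (a reconstruction consistent with the symbols in the fragment, not a verbatim quote):

```latex
\min_{x \in \mathbb{R}^m} \; \tfrac{1}{2}\,\| y - \Phi x \|_2^2 + \lambda \, \| x \|_0,
\qquad y \in \mathbb{R}^n, \quad \Phi \in \mathbb{R}^{n \times m}, \quad m > n,
```

where ‖x‖₀ counts the nonzero entries of x and λ balances data fit against sparsity.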
Optimal deep neural networks for sparse recovery via Laplace techniques
• Mathematics, Computer Science
• ArXiv
• 2017
It is shown that the centroid can be computed analytically by extending a recent result that facilitates the volume computation of polytopes via Laplace transformations, and may serve as a viable initialization to be further optimized and trained using particular input datasets at hand.
Multilayer Convolutional Sparse Modeling: Pursuit and Dictionary Learning
• Computer Science, Mathematics
• IEEE Transactions on Signal Processing
• 2018
This work represents a bridge between matrix factorization, sparse dictionary learning, and sparse autoencoders. It is shown that training the filters is essential to allow for nontrivial signals in the model, and an online algorithm is proposed to learn the dictionaries from real data, effectively resulting in cascaded sparse convolutional layers.
Understanding Trainable Sparse Coding with Matrix Factorization
• Computer Science, Mathematics
• ICLR
• 2017
The analysis reveals that a specific matrix factorization of the Gram kernel of the dictionary attempts to nearly diagonalise the kernel with a basis that produces a small perturbation of the ℓ1 ball, and proves that the resulting splitting algorithm enjoys an improved convergence bound with respect to the non-adaptive version.
The Sparse Recovery Autoencoder
A new method is presented to learn linear encoders that adapt to data while still performing well with the widely used ℓ1 decoder, based on the insight that unfolding the convex decoder into projected gradient steps makes this adaptation possible.
Adaptive Acceleration of Sparse Coding via Matrix Factorization
• Mathematics
• 2016
Sparse coding remains a core building block in many data analysis and machine learning pipelines. Typically it is solved by relying on generic optimization techniques that are optimal in the class…
Generalization bounds for deep thresholding networks
• Computer Science, Mathematics
• ArXiv
• 2020
This work considers compressive sensing in the scenario where the sparsity basis (dictionary) is not known in advance but needs to be learned from examples, and defines deep networks parametrized by the dictionary, which are called deep thresholding networks.

#### References

Showing 1–10 of 45 references
Learning Efficient Sparse and Low Rank Models
• Computer Science, Medicine
• IEEE Transactions on Pattern Analysis and Machine Intelligence
• 2015
A principled way is shown to construct learnable pursuit-process architectures for structured sparse and robust low-rank models, derived from the iterations of proximal descent algorithms; these architectures learn to approximate the exact parsimonious representation at a fraction of the complexity of the standard optimization methods.
Sparse Estimation with Structured Dictionaries
• D. Wipf
• Computer Science, Mathematics
• NIPS
• 2011
Sparse penalized regression models are analyzed with the purpose of finding, to the extent possible, regimes of dictionary invariant performance, and a Type II Bayesian estimator with a dictionary-dependent sparsity penalty is shown to have a number of desirable invariance properties leading to provable advantages over more conventional penalties.
Learning Deep ℓ0 Encoders
• Computer Science, Mathematics
• AAAI
• 2016
The proposed Deep ℓ0 Encoders enjoy faster inference, larger learning capacity, and better scalability compared to conventional sparse coding solutions, and under task-driven losses, the models can be conveniently optimized from end to end.
Learning Deep ℓ0 Encoders
• Computer Science
• 2015
The proposed deep encoders enjoy faster inference, larger learning capacity, and better scalability compared to conventional sparse coding solutions, and under task-driven losses, the models can be conveniently optimized from end to end.
Learning Fast Approximations of Sparse Coding
• Computer Science
• ICML
• 2010
Two versions are proposed of a very fast algorithm that produces approximate estimates of the sparse code, which can be used to compute good visual features or to initialize exact iterative algorithms.
Iterative Thresholding for Sparse Approximations
• Mathematics
• 2008
Sparse signal expansions represent or approximate a signal using a small number of elements from a large collection of elementary waveforms. Finding the optimal sparse expansion is known to be NP-hard…
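The iterative hard-thresholding scheme this line of work studies can be sketched in a few lines of numpy (hypothetical names; the basic variant below assumes ‖Φ‖₂ < 1, whereas the normalized variant cited further down adapts the step size instead):

```python
import numpy as np

def hard_threshold(v, s):
    # Keep the s largest-magnitude entries of v, zero out the rest.
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-s:]
    out[keep] = v[keep]
    return out

def iht(Phi, y, s, n_iter=200):
    """Basic iterative hard thresholding for y = Phi x with s-sparse x:
    a gradient step on ||y - Phi x||^2 followed by projection onto the
    set of s-sparse vectors. Converges when ||Phi||_2 < 1."""
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        x = hard_threshold(x + Phi.T @ (y - Phi @ x), s)
    return x
```

Note that each iteration is again a fixed linear map followed by a (hard) thresholding nonlinearity, the same structure the main paper unfolds into network layers.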
Deep Unfolding: Model-Based Inspiration of Novel Deep Architectures
• Computer Science, Mathematics
• ArXiv
• 2014
This work starts with a model-based approach and an associated inference algorithm, and unfolds the inference iterations as layers in a deep network; it shows how this framework allows conventional networks to be interpreted as mean-field inference in Markov random fields, and new architectures to be obtained by instead using belief propagation as the inference algorithm.
Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization
• Computer Science, Medicine
• Proceedings of the National Academy of Sciences of the United States of America
• 2003
This article obtains parallel results in a more general setting, where the dictionary D can arise from two or several bases, frames, or even less structured systems, and sketches three applications: separating linear features from planar ones in 3D data, noncooperative multiuser encoding, and identification of over-complete independent component models.
Normalized Iterative Hard Thresholding: Guaranteed Stability and Performance
• Mathematics, Computer Science
• IEEE Journal of Selected Topics in Signal Processing
• 2010
With this modification, empirical evidence suggests that the algorithm is faster than many other state-of-the-art approaches while showing similar performance, and the modified algorithm retains theoretical performance guarantees similar to those of the original.
Latent Variable Bayesian Models for Promoting Sparsity
• Computer Science, Mathematics
• IEEE Transactions on Information Theory
• 2011
In coefficient space, the analysis reveals that Type II is exactly equivalent to performing standard MAP estimation using a particular class of dictionary- and noise-dependent, nonfactorial coefficient priors.