Optimization with Sparsity-Inducing Penalties
- F. Bach, Rodolphe Jenatton, J. Mairal, G. Obozinski
- Computer Science · Found. Trends Mach. Learn.
- 3 August 2011
This monograph covers proximal methods, block-coordinate descent, reweighted-l2 schemes, and working-set and homotopy methods, as well as non-convex formulations and extensions, and provides an extensive set of experiments comparing the various algorithms from a computational point of view.
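For a concrete feel of the simplest method in this family, here is a minimal sketch of the proximal-gradient (ISTA) iteration for the l1 penalty; the function names, fixed step size, and least-squares loss are illustrative assumptions, not code from the monograph.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1: elementwise soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, b, lam, step, n_iters=200):
    """Proximal-gradient (ISTA) iterations for min_w 0.5*||Aw - b||^2 + lam*||w||_1."""
    w = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ w - b)                         # gradient of the smooth part
        w = soft_threshold(w - step * grad, step * lam)  # proximal step
    return w
```

For convergence, `step` should be at most 1/L, where L is the largest eigenvalue of A^T A; accelerated (FISTA-style) variants add a momentum term on top of the same two steps.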
Structured Variable Selection with Sparsity-Inducing Norms
This work considers the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsity-inducing norms defined as sums of Euclidean norms on certain subsets of variables, and explores the relationship between groups defining the norm and the resulting nonzero patterns.
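As a rough illustration of how a sum of Euclidean norms zeroes out whole groups of variables, here is the proximal operator for the simple non-overlapping case (the paper's focus is the richer overlapping setting, where the operator is no longer this simple); the names are placeholders.

```python
import numpy as np

def prox_group_l2(w, groups, t):
    """Proximal operator of t * sum_g ||w_g||_2 for non-overlapping groups:
    each block is shrunk toward zero, and whole groups vanish at once."""
    out = w.copy()
    for g in groups:  # g is an integer index array for one group
        norm = np.linalg.norm(w[g])
        out[g] = 0.0 if norm <= t else (1.0 - t / norm) * w[g]
    return out
```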
Structured Sparse Principal Component Analysis
We present an extension of sparse PCA, or sparse dictionary learning, where the sparsity patterns of all dictionary elements are structured and constrained to belong to a prespecified set of shapes.…
Proximal Methods for Hierarchical Sparse Coding
- Rodolphe Jenatton, J. Mairal, G. Obozinski, F. Bach
- Computer Science · J. Mach. Learn. Res.
- 11 September 2010
The procedure has complexity linear, or close to linear, in the number of atoms, and allows accelerated gradient techniques to solve the tree-structured sparse approximation problem at the same computational cost as traditional ones using the l1-norm.
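A minimal sketch of the computational fact behind this, assuming (as in the paper) that the groups form a tree: if single-group proximal steps are applied children-before-parents, their composition is the exact proximal operator of the whole tree-structured norm, so one pass over the groups suffices. The names and the plain l2 group norm are illustrative.

```python
import numpy as np

def prox_tree(w, ordered_groups, weights):
    """Exact prox of sum_g weights[g] * ||w_g||_2 for tree-structured groups,
    computed by composing single-group proximal steps from leaves to root."""
    out = w.copy()
    for g, t in zip(ordered_groups, weights):  # children listed before parents
        norm = np.linalg.norm(out[g])
        out[g] = 0.0 if norm <= t else (1.0 - t / norm) * out[g]
    return out
```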
A latent factor model for highly multi-relational data
This paper proposes a method for modeling large multi-relational datasets, with possibly thousands of relations, based on a bilinear structure, which captures various orders of interaction of the data and also shares sparse latent factors across different relations.
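To make the bilinear structure concrete, here is a hypothetical scoring function in the spirit of the abstract: the matrix for relation r is a combination of latent factors shared across all relations, which is what keeps thousands of relations tractable. All names and shapes are placeholders.

```python
import numpy as np

def bilinear_score(e_s, e_o, alpha_r, Theta):
    """Score of a (subject, relation, object) triple: the relation matrix
    R_r = sum_k alpha_r[k] * Theta[k] mixes shared latent factors Theta."""
    R = np.tensordot(alpha_r, Theta, axes=1)  # (d, d) relation matrix
    return e_s @ R @ e_o                      # scalar plausibility score
```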
Proximal Methods for Sparse Hierarchical Dictionary Learning
This work considers a tree-structured sparse regularization to learn dictionaries embedded in a hierarchy, thus providing a competitive alternative to probabilistic topic models.
Convex optimization with sparsity-inducing norms
How Good is the Bayes Posterior in Deep Neural Networks Really?
This work demonstrates through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions than simpler methods, including point estimates obtained from SGD, and argues that it is timely to focus on understanding the origin of the improved performance of cold posteriors.
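The tempered ("cold") posterior at the center of this discussion can be written as follows; T = 1 recovers the ordinary Bayes posterior, and the surprising empirical finding is that T < 1 often predicts better.

```latex
% Tempered posterior with temperature T; T = 1 is the Bayes posterior,
% T < 1 is a "cold" posterior that concentrates around the modes.
p_T(\theta \mid D) \propto \exp\big(-U(\theta)/T\big),
\qquad
U(\theta) = -\log p(D \mid \theta) - \log p(\theta).
```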
Training independent subnetworks for robust prediction
This work shows that, using a multi-input multi-output (MIMO) configuration, one can use a single model's capacity to train multiple subnetworks that independently learn the task at hand, improving model robustness without increasing compute.
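A toy forward pass showing the MIMO idea, assuming a small fully connected trunk; the weight shapes and ReLU activations are stand-ins for whatever backbone is actually used. At test time the same example is repeated M times and the M head outputs are averaged, giving an ensemble from a single forward pass.

```python
import numpy as np

def mimo_forward(xs, W_in, W_body, heads):
    """M inputs are concatenated and pushed through one shared trunk;
    M separate heads each predict for 'their' input, so the single
    network implicitly trains M independent subnetworks."""
    x = np.concatenate(xs)            # stack the M inputs into one vector
    h = np.maximum(W_in @ x, 0.0)     # shared hidden layer (ReLU)
    h = np.maximum(W_body @ h, 0.0)   # shared trunk (ReLU)
    return [W_h @ h for W_h in heads] # one prediction per head
```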
Scalable Hyperparameter Transfer Learning
This work proposes a multi-task adaptive Bayesian linear regression model for transfer learning in Bayesian optimization (BO), whose complexity is linear in the number of function evaluations: one Bayesian linear regression model is associated with each black-box function optimization problem (or task), while transfer learning is achieved by coupling the models through a shared deep neural net.
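A sketch of the per-task head, which is where the linear-in-evaluations complexity comes from: each task keeps a closed-form Bayesian linear regression posterior on features produced by the shared network. Here `Phi` stands for the shared net's output on the task's evaluated points, and `alpha`/`beta` are illustrative prior and noise precisions.

```python
import numpy as np

def blr_posterior(Phi, y, alpha=1.0, beta=1.0):
    """Closed-form Gaussian posterior over task-specific weights for
    Bayesian linear regression on shared features Phi = phi(X)."""
    d = Phi.shape[1]
    S_inv = alpha * np.eye(d) + beta * Phi.T @ Phi   # posterior precision
    mean = beta * np.linalg.solve(S_inv, Phi.T @ y)  # posterior mean
    return mean, S_inv
```

Transfer across tasks then happens only through the shared feature map phi, which is fit jointly on data from all tasks.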