Scalable Bayesian Regression in High Dimensions With Multiple Data Sources

  • Konstantinos Perrakis, Sach Mukherjee, the Alzheimer’s Disease Neuroimaging Initiative
  • Journal of Computational and Graphical Statistics
  • pp. 28–39
Abstract: Applications of high-dimensional regression often involve multiple sources or types of covariates. We propose methodology for this setting, emphasizing the “wide data” regime with large total dimensionality p and sample size n. We focus on a flexible ridge-type prior with shrinkage levels that are specific to each data type or source and that are set automatically by empirical Bayes. All estimation, including the setting of shrinkage levels, is formulated mainly in terms of inner products…
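As a minimal numerical sketch of the idea in the abstract (illustrative only; the function name, data, and penalty values are hypothetical, not from the paper), source-specific ridge shrinkage amounts to a diagonal penalty matrix whose entries are constant within each covariate block:

```python
import numpy as np

def multi_source_ridge(X_blocks, y, lambdas):
    """Ridge fit with one shrinkage level per data source.

    X_blocks : list of (n, p_g) arrays, one per source
    lambdas  : list of penalty levels, one per source
    Returns the penalized least-squares coefficients.
    """
    X = np.hstack(X_blocks)
    # Diagonal penalty D: lambda_g repeated over block g's columns.
    d = np.concatenate([np.full(Xg.shape[1], lam)
                        for Xg, lam in zip(X_blocks, lambdas)])
    # Solve (X'X + D) beta = X'y
    return np.linalg.solve(X.T @ X + np.diag(d), X.T @ y)

rng = np.random.default_rng(0)
X1 = rng.standard_normal((50, 5))    # e.g. clinical covariates
X2 = rng.standard_normal((50, 20))   # e.g. imaging features
beta_true = np.concatenate([np.ones(5), np.zeros(20)])
y = np.hstack([X1, X2]) @ beta_true + 0.1 * rng.standard_normal(50)

# Light shrinkage on the informative block, heavy on the noise block.
beta = multi_source_ridge([X1, X2], y, lambdas=[0.1, 10.0])
```

In practice the shrinkage levels would be chosen by empirical Bayes or cross-validation rather than fixed by hand, and in the wide-data regime the solve would be done in the n-dimensional dual rather than forming the p-by-p system as here.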
High-dimensional regression in practice: an empirical study of finite-sample prediction, variable selection and ranking
A large-scale comparison of penalized regression methods is presented, with no unambiguous winner across all scenarios or goals, even in this restricted setting where all data align well with the assumptions underlying the methods.
Fast Cross-validation for Multi-penalty High-dimensional Ridge Regression
High-dimensional prediction with multiple data types needs to account for potentially strong differences in predictive signal. Ridge regression is a simple model for high-dimensional data that has ...
Fast cross-validation for multi-penalty ridge regression
A very flexible framework is developed that includes prediction of several types of response, allows for unpenalized covariates, can optimize several performance criteria, and implements repeated CV.


Consistent High-Dimensional Bayesian Variable Selection via Penalized Credible Regions
  • H. Bondell, B. Reich
  • Computer Science, Medicine
  • Journal of the American Statistical Association
  • 2012
This work proposes a conjugate prior only on the full model parameters and uses sparse solutions within posterior credible regions to perform selection, showing that these sparse solutions can be computed via existing algorithms.
An Information Matrix Prior for Bayesian Analysis in Generalized Linear Models with High Dimensional Data.
A novel specification for a general class of prior distributions for high-dimensional generalized linear models, called Information Matrix (IM) priors, is developed, based on a broad generalization of Zellner's g-prior for Gaussian linear models.
A Sparse-Group Lasso
For high-dimensional supervised learning problems, using problem-specific assumptions can often lead to greater accuracy. For problems with grouped covariates, which are believed to have sparse…
Penalized regression, standard errors, and Bayesian lassos
Penalized regression methods for simultaneous variable selection and coefficient estimation, especially those based on the lasso of Tibshirani (1996), have received a great deal of attention in recent…
Inference with normal-gamma prior distributions in regression problems
This paper considers the effects of placing an absolutely continuous prior distribution on the regression coefficients of a linear model. We show that the posterior expectation is a matrix-shrunken…
Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
Variable selection is fundamental to high-dimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally…
Model uncertainty and variable selection in Bayesian lasso regression
  • Chris Hans
  • Mathematics, Computer Science
  • Stat. Comput.
  • 2010
This paper describes how the marginal likelihood can be accurately computed when the number of predictors in the model is not too large, allowing for model space enumeration when the total number of possible predictors is modest.
The properties of the maximum a posteriori estimator are investigated, as sparse estimation plays an important role in many problems; connections with some well-established regularization procedures are revealed, and some asymptotic results are shown.
On Bayesian lasso variable selection and the specification of the shrinkage parameter
A Bayesian implementation of lasso regression is proposed that accomplishes both shrinkage and variable selection through Bayes factors that evaluate the inclusion of each covariate in the model formulation.
Sparsity and smoothness via the fused lasso
Summary. The lasso penalizes a least squares regression by the sum of the absolute values (L1-norm) of the coefficients. The form of this penalty encourages sparse solutions (with many coefficients…
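As a concrete illustration of the sparsity described in this snippet (a standard textbook fact, not code from the cited paper): for an orthonormal design, the lasso solution is simply the soft-thresholded OLS estimate, so coefficients below the penalty level are set exactly to zero:

```python
import numpy as np

def soft_threshold(b, lam):
    # Lasso solution under an orthonormal design: shrink each OLS
    # coefficient toward zero by lam, truncating small ones to exactly 0.
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

ols = np.array([3.0, -0.4, 1.2, 0.1])
sparse = soft_threshold(ols, 0.5)  # the 0.1 and -0.4 entries vanish
```

This is what "encourages sparse solutions" means operationally: unlike the ridge penalty, the L1 penalty has a kink at zero, so small coefficients are zeroed out rather than merely shrunk.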