Corpus ID: 219179362

Feature-weighted elastic net: using "features of features" for better prediction

Jingyi K. Tay, Nima Aghaeepour, Trevor J. Hastie, Robert Tibshirani

In some supervised learning settings, the practitioner may have additional information about the features used for prediction. We propose a new method that leverages this information for better prediction. The method, which we call the feature-weighted elastic net ("fwelnet"), uses these "features of features" to adapt the relative penalties on the feature coefficients in the elastic net penalty. In our simulations, fwelnet outperforms the lasso in terms of test mean squared error…
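The abstract describes adapting per-feature penalties using side information. A minimal sketch of the underlying idea, not the authors' fwelnet algorithm (which learns the weights from the "features of features"): given fixed per-feature penalty weights, a weighted L1 fit can be obtained from a plain lasso by rescaling the design columns. The weight values here are hypothetical, chosen only for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]  # only the first three features carry signal
y = X @ beta + 0.1 * rng.standard_normal(n)

# Hypothetical per-feature penalty weights; in fwelnet these would be
# derived from "features of features" rather than fixed by hand.
w = np.ones(p)
w[3:] = 5.0  # penalize the presumed-noise features more heavily

# A weighted L1 penalty sum_j w_j |beta_j| is equivalent to a plain
# lasso on the rescaled design X_j / w_j, with the fitted coefficients
# divided back by w_j.
fit = Lasso(alpha=0.1, fit_intercept=False).fit(X / w, y)
beta_hat = fit.coef_ / w
```

With the heavier weights on the noise features, their effective penalty is five times larger, so they are shrunk to (or near) zero while the signal features survive.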
Fast marginal likelihood estimation of penalties for group-adaptive elastic net
Nowadays, clinical research routinely uses omics data, such as gene expression, for predicting clinical outcomes or selecting markers. Additionally, so-called co-data are often available, providing…
Group-regularized ridge regression via empirical Bayes noise level cross-validation.
Features in predictive models are not exchangeable, yet common supervised models treat them as such. Here we study ridge regression when the analyst can partition the features into $K$ groups based…
Better prediction by use of co-data: adaptive group-regularized ridge regression.
It is shown that the group-specific penalties may lead to a larger distinction between 'near-zero' and relatively large regression parameters, which facilitates post hoc variable selection and improves the predictive performances of ordinary logistic ridge regression and the group lasso.
Adaptive penalization in high-dimensional regression and classification with external covariates using variational Bayes
This work presents a method that differentially penalizes feature groups defined by external covariates and adapts the relative strength of penalization to the information content of each group; this extends the range of applications of penalized regression, improves model interpretability, and can improve prediction performance.
Feature selection guided by structural information
In generalized linear regression problems with an abundant number of features, lasso-type regularization, which imposes an l1-constraint on the regression coefficients, has become a widely established…
Weighted Lasso with Data Integration
Through simulations, it is shown that the weighted lasso with integrated external information on the covariates outperforms the lasso and the adaptive lasso, in terms of both variable selection and prediction, when that information ranges from relevant to partly relevant.
Incorporating prior knowledge of predictors into penalized classifiers with multiple penalty terms
Simulated and real data examples demonstrate that, if prior knowledge on gene grouping is indeed informative, the new methods perform better than the two standard penalized methods, yielding higher predictive accuracy and screening out more irrelevant genes.
Regularization and variable selection via the elastic net
Summary. We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a…
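The elastic net penalty blends the lasso's L1 term with a ridge L2 term. A minimal usage sketch with scikit-learn's `ElasticNet` (an implementation choice of this note, not the original paper's software), whose objective is (1/(2n))||y − Xβ||² + α(ρ||β||₁ + ((1 − ρ)/2)||β||²) with mixing parameter ρ = `l1_ratio`:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(1)
X = rng.standard_normal((80, 20))
coef = np.zeros(20)
coef[:4] = 1.5  # sparse truth: four active features
y = X @ coef + 0.1 * rng.standard_normal(80)

# l1_ratio blends the penalties: 1.0 is the pure lasso, 0.0 pure ridge.
enet = ElasticNet(alpha=0.05, l1_ratio=0.5, fit_intercept=False).fit(X, y)
```

The L1 component keeps the solution sparse, while the L2 component stabilizes the fit when features are correlated, which is the "grouping" behavior the paper highlights.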
Regularising Non-linear Models Using Feature Side-information
This paper proposes a framework that incorporates feature side-information during the learning of very general model families to improve prediction performance, and that controls the structure of the learned models so that they reflect feature similarities as defined by the side-information.
IPF-LASSO: Integrative L1-Penalized Regression with Penalty Factors for Prediction Based on Multi-Omics Data
This paper proposes a simple penalized regression method called IPF-LASSO (Integrative LASSO with Penalty Factors); the method is implemented in the R package ipflasso and illustrated through applications to two real-life cancer datasets.
Sparsity and smoothness via the fused lasso
Summary. The lasso penalizes a least squares regression by the sum of the absolute values (L1-norm) of the coefficients. The form of this penalty encourages sparse solutions (with many coefficients…
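For reference, the fused lasso estimate described in the snippet above, for features with a natural ordering, solves

\[
\hat{\beta} = \arg\min_{\beta} \; \frac{1}{2}\sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda_1 \sum_{j=1}^{p} |\beta_j| + \lambda_2 \sum_{j=2}^{p} |\beta_j - \beta_{j-1}|,
\]

where the first penalty encourages sparsity in the coefficients themselves and the second encourages sparsity in their successive differences, i.e. piecewise-constant coefficient profiles.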
Practical Bayesian Optimization of Machine Learning Algorithms
This work describes new algorithms that account for the variable cost of learning-algorithm experiments and can leverage multiple cores for parallel experimentation; the proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms.