Generalizing Gain Penalization for Feature Selection in Tree-Based Models
@article{Wundervald2020GeneralizingGP,
  title   = {Generalizing Gain Penalization for Feature Selection in Tree-Based Models},
  author  = {Bruna D. Wundervald and Andrew C. Parnell and Katarina Domijan},
  journal = {IEEE Access},
  year    = {2020},
  volume  = {8},
  pages   = {190231-190239}
}
We develop a new approach to feature selection via gain penalization in tree-based models. First, we show that previous methods do not perform sufficient regularization and often exhibit sub-optimal out-of-sample performance, especially when correlated features are present. We then propose a new gain penalization idea that yields a general local-global regularization scheme for tree-based models. The new method allows full flexibility in the choice of feature-specific importance weights…
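As an illustration of the general idea only (not the authors' exact algorithm), gain penalization multiplies the raw split gain of each feature by a feature-specific weight before the best split is chosen. A minimal sketch for a regression stump, with the `penalties` weights and variance-reduction gain as illustrative choices:

```python
import numpy as np

def variance_gain(y, mask):
    """Variance reduction from splitting y into mask / ~mask groups."""
    n = len(y)
    left, right = y[mask], y[~mask]
    if len(left) == 0 or len(right) == 0:
        return 0.0
    return np.var(y) - (len(left) / n) * np.var(left) - (len(right) / n) * np.var(right)

def best_penalized_split(X, y, penalties):
    """Pick the (feature, threshold) pair maximizing penalty-weighted gain.

    penalties[j] in (0, 1] down-weights the raw gain of feature j; in a
    forest, features already used could keep a weight of 1.0 while unused
    ones get a smaller weight, discouraging the entry of new features.
    """
    best = (None, None, -np.inf)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            g = penalties[j] * variance_gain(y, X[:, j] <= t)
            if g > best[2]:
                best = (j, t, g)
    return best
```

Setting all penalties to 1.0 recovers the ordinary greedy split search; smaller per-feature weights make that feature's splits compete at a handicap.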
3 Citations
Fecal steroids as a potential tool for conservation paleobiology in East Africa
- Environmental Science, Geography · Biodiversity and Conservation
- 2021
Conservation paleobiology seeks to leverage proxy reconstructions of ecological communities and environmental conditions to predict future changes and inform management decisions. Populations of East…
The Role of Intelligent Technologies in Early Detection of Autism Spectrum Disorder (ASD): A Scoping Review
- Psychology · IEEE Access
- 2022
Background: A two-year delay is reported between the first developmental concern raised by the parents and the diagnosis of ASD (Autism Spectrum Disorder), delaying the start of early intervention…
References
SHOWING 1-10 OF 55 REFERENCES
Feature selection via regularized trees
- Computer Science · The 2012 International Joint Conference on Neural Networks (IJCNN)
- 2012
A tree regularization framework is proposed, which enables many tree models to perform feature selection efficiently and provides an effective and efficient feature selection solution for many practical problems.
Regression Shrinkage and Selection via the Lasso
- Computer Science
- 1996
A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
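The constrained form described above is equivalent to minimizing an L1-penalized least-squares objective, which coordinate descent solves via soft-thresholding. A minimal sketch (the `lasso_cd` name, objective scaling, and iteration count are our own illustrative choices):

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; exactly zero when |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove feature j's current contribution.
            r = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r / n
            beta[j] = soft_threshold(z, lam) / (X[:, j] @ X[:, j] / n)
    return beta
```

The soft-threshold step is what produces exact zeros in the coefficient vector, i.e. the variable selection the abstract refers to.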
Learning Nonlinear Functions Using Regularized Greedy Forest
- Computer Science · IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2014
This paper proposes a method that directly learns decision forests via fully-corrective regularized greedy search using the underlying forest structure and achieves higher accuracy and smaller models than gradient boosting on many of the datasets it has tested on.
Pruning Random Forests for Prediction on a Budget
- Computer Science · NIPS
- 2016
This work poses pruning RFs as a novel 0-1 integer program with linear constraints that encourages feature re-use and establishes total unimodularity of the constraint set to prove that the corresponding LP relaxation solves the original integer program.
Bagging predictors
- Computer Science · Machine Learning
- 1996
Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.
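The bagging idea summarized above can be sketched in a few lines: fit a high-variance base learner on bootstrap resamples and average the predictions. The 1-NN base learner and synthetic data here are illustrative choices, not from the paper:

```python
import numpy as np

def one_nn_predict(X_tr, y_tr, X_te):
    """1-nearest-neighbour regression: a low-bias, high-variance base learner."""
    d = np.abs(X_te[:, None] - X_tr[None, :])
    return y_tr[np.argmin(d, axis=1)]

def bagged_predict(X_tr, y_tr, X_te, n_bags=50, seed=0):
    """Average base-learner predictions over bootstrap resamples of the data."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_bags):
        idx = rng.integers(0, len(X_tr), size=len(X_tr))  # sample with replacement
        preds.append(one_nn_predict(X_tr[idx], y_tr[idx], X_te))
    return np.mean(preds, axis=0)
```

On noisy data the averaged predictor is typically smoother and more accurate than any single base learner, which is the variance-reduction effect the abstract reports.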
Bayesian Additive Regression Trees
- Computer Science
- 2006
We develop a Bayesian "sum-of-trees" model where each tree is constrained by a regularization prior to be a weak learner, and fitting and inference are accomplished via an iterative Bayesian…
Practical Bayesian Optimization of Machine Learning Algorithms
- Computer Science · NIPS
- 2012
This work describes new algorithms that take into account the variable cost of learning algorithm experiments and that can leverage the presence of multiple cores for parallel experimentation and shows that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms.
BART: Bayesian Additive Regression Trees
- Computer Science
- 2010
We develop a Bayesian "sum-of-trees" model where each tree is constrained by a regularization prior to be a weak learner, and fitting and inference are accomplished via an iterative Bayesian…
Generalized random forests
- Computer Science, Mathematics · The Annals of Statistics
- 2019
A flexible, computationally efficient algorithm for growing generalized random forests, an adaptive weighting function derived from a forest designed to express heterogeneity in the specified quantity of interest, and an estimator for their asymptotic variance that enables valid confidence intervals are proposed.