• Corpus ID: 41727607

Paper in Business Analytics Feature Selection using LASSO

Valeria Francesca Fonti, Eduard N. Belitser
Which attributes are most relevant for describing a response variable? This is one of the first questions a researcher needs to ask when analyzing a dataset, and the answer is not trivial. This research paper aims to explain and discuss the use of the LASSO method to address the feature selection task. Feature selection is a crucial and challenging task in statistical modeling; many studies try to optimize and standardize this process for any kind of data, but…

Evaluating Feature Selection Methods for Short-Term Load Forecasting

Test results show that all feature selection methods could identify a custom-made subset of highly relevant features for each household, and building predictive models utilizing feature selection techniques led to considerable improvements in training speed and simplicity, as well as comparable prediction accuracy with models without feature engineering.

Author response to Cunha et al

In this response to the Letter to the Editor by Cunha et al, the reasons for choosing LASSO as the feature selection method and XGBoost with Leave-One-Out Cross-Validation (LOOCV) as the classifier for radiomics models are explained and discussed.

Simulated annealing for symbolic regression

Simulated Annealing exhibits an intrinsic ability to escape from poor local minima, which is demonstrated here to yield competitive results compared with state-of-the-art Symbolic Regression techniques that depend on population-based meta-heuristics and committees of learning machines.

Machine Learning in Spatial Study of Asthma Rate Distribution in Los Angeles County

Public health studies have revealed numerous links between certain factors and asthma. Due to various causes of asthma attack and different reactions of people, it is necessary to construct a

Correlating drug prescriptions with prognosis in severe COVID-19: first step towards resource management

An important result in the area of Artificial Intelligence is achieved, as it is able to establish a correlation between concrete variables in a real and extremely complex environment of clinical data from COVID-19.

The Impact of Gray-Listing on Capital Flows: An Analysis Using Machine Learning

The Financial Action Task Force’s gray list publicly identifies countries with strategic deficiencies in their AML/CFT regimes (i.e., in their policies to prevent money laundering and the financing of

Exploratory Analysis of COVID-19 Patients Using Principal Component Analysis, Feature Selection and Predictive Algorithms

A classifier was developed that achieved 76% mean accuracy, 77% mean precision and 92% mean sensitivity to identify individuals with COVID-19, and found a set of 18 variables that showed some association with a positive PCR result.

analysis of COVID-19 patients using principal component analysis, feature selection and predictive algorithms

A classifier was developed that achieved 76% mean accuracy, 77% mean precision and 92% mean sensitivity to identify individuals with COVID-19, a novel coronavirus disease that emerged in late 2019.

Prediction of long-term hospitalisation and all-cause mortality in patients with chronic heart failure on Dutch claims data: a machine learning approach

Background: Accurately predicting which patients with chronic heart failure (CHF) are particularly vulnerable to adverse outcomes is of crucial importance to support clinical decision making. The

Feature Selection and Negative Binomial Regression for Predicting Number of Defects in Wire Mesh Production

In wire mesh production, many types of defects are found. When the factors related to the number of defects occurring are correctly identified, various improvement methods can then be applied to



Regression Shrinkage and Selection via the Lasso

A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.

Regularization and variable selection via the elastic net

It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS‐EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition

This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering.

Linear models in statistics

Preface. 1. Introduction. 2. Matrix Algebra. 3. Random Vectors and Matrices. 4. Multivariate Normal Distribution. 5. Distribution of Quadratic Forms in y. 6. Simple Linear Regression. 7. Multiple

Extended BIC for small-n-large-P sparse GLM

An Analysis of Feature Selection Techniques (2013)