# Estimation of distribution algorithms as logistic regression regularizers of microarray classifiers.

@article{Bielza2009EstimationOD, title={Estimation of distribution algorithms as logistic regression regularizers of microarray classifiers.}, author={Concha Bielza and V{\'i}ctor Robles and Pedro Larra{\~n}aga}, journal={Methods of information in medicine}, year={2009}, volume={48}, number={3}, pages={236-41} }

OBJECTIVES
The "large k (genes), small N (samples)" phenomenon complicates the problem of microarray classification with logistic regression. The indeterminacy of the maximum likelihood solutions, multicollinearity of the predictor variables and data over-fitting cause unstable parameter estimates. Moreover, computational problems arise due to the large number of predictor variables (genes). Regularized logistic regression excels as a solution. However, the difficulties found here involve an…
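To make the setting concrete, the following is a minimal sketch of fitting a penalized logistic regression with a simple continuous EDA when there are more genes than samples. It is not the authors' algorithm: the ridge penalty, the univariate Gaussian model, the toy data, and all names (`penalized_loglik`, `umda_gaussian`, `lam`, `pop`, `keep`, `gens`) are assumptions chosen for illustration.

```python
import math
import random


def penalized_loglik(beta, X, y, lam):
    """Ridge-penalized logistic log-likelihood; beta[0] is an unpenalized intercept."""
    total = 0.0
    for xi, yi in zip(X, y):
        eta = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
        # y*eta - log(1 + exp(eta)), with the log term computed stably
        total += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total - lam * sum(b * b for b in beta[1:])


def umda_gaussian(X, y, lam=1.0, pop=60, keep=20, gens=40, seed=0):
    """Univariate Gaussian EDA (illustrative): sample, select the best, refit the model."""
    rng = random.Random(seed)
    dim = len(X[0]) + 1
    mu, sigma = [0.0] * dim, [1.0] * dim
    # Start from the all-zero coefficient vector so the result is never worse than it.
    best = [0.0] * dim
    best_f = penalized_loglik(best, X, y, lam)
    for _ in range(gens):
        population = [[rng.gauss(m, s) for m, s in zip(mu, sigma)] for _ in range(pop)]
        population.sort(key=lambda b: penalized_loglik(b, X, y, lam), reverse=True)
        elite = population[:keep]
        top_f = penalized_loglik(elite[0], X, y, lam)
        if top_f > best_f:
            best, best_f = elite[0], top_f
        # Re-estimate an independent Gaussian per coefficient from the selected individuals.
        for j in range(dim):
            col = [b[j] for b in elite]
            mu[j] = sum(col) / keep
            var = sum((c - mu[j]) ** 2 for c in col) / keep
            sigma[j] = max(math.sqrt(var), 1e-6)
    return best, best_f


# Toy data (assumed): N = 6 samples, k = 8 "genes", so k > N.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(8)] for _ in range(6)]
y = [1, 0, 1, 0, 1, 0]
beta, value = umda_gaussian(X, y)
```

With k > N the unpenalized likelihood generally has no unique maximizer, which is why the sketch scores candidates with a penalized objective; the penalty weight `lam` would normally be tuned, for example by cross-validation.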

## 15 Citations

Chapter 6 Estimation of Distribution Algorithms in Gene Expression Data Analysis

- Computer Science
- 2011

This chapter provides an overview of existing EDAs, reviews some of their applications in bioinformatics, and finally discusses in more detail a specific problem that has been solved with this method.

Regularized continuous estimation of distribution algorithms

- Computer Science
- 2013

The results show that the optimization performance of the proposed RegEDAs is less affected by the increase in the problem size than other EDAs, and they are able to obtain significantly better optimization values for many of the functions in high-dimensionality settings.

Estimation of Distribution Algorithms in Gene Expression Data Analysis

- Computer Science
- 2012

This chapter provides an overview of existing EDAs, reviews some of their applications in bioinformatics, and finally discusses in more detail a specific problem that has been solved with this method.

A review of estimation of distribution algorithms in bioinformatics

- Computer Science
- BioData Mining
- 2008

A basic taxonomy of EDA techniques is set out, underlining the nature and complexity of the probabilistic model of each EDA variant, and emphasizing the EDA paradigm's potential for further research in this domain.

Scaling Up Estimation of Distribution Algorithms for Continuous Optimization

- Computer Science
- IEEE Transactions on Evolutionary Computation
- 2013

EDA-MCC is the first successful instance of multivariate probabilistic model-based EDAs that can be effectively applied to a general class of up to 500-D problems and outperforms some newly developed algorithms designed specifically for large-scale optimization.

A latent space-based estimation of distribution algorithm for large-scale global optimization

- Computer Science
- Soft Comput.
- 2019

A latent space-based EDA (LS-EDA) is proposed, which transforms the multivariate probabilistic model of a Gaussian-based EDA into its principal component latent subspace with lower dimensionality, and which outperforms the others on the benchmark functions with overlapping and nonseparable variables.

Identification of biomarkers that distinguish chemical contaminants based on gene expression profiles

- Biology
- BMC Genomics
- 2013

A new feature selection algorithm called gradient method was developed that had a relatively high training classification as well as prediction accuracy with the lowest overfitting rate of the methods tested.

Random mask-based estimation of the distribution algorithm for stacked auto-encoder one-step pre-training

- Computer Science
- Comput. Ind. Eng.
- 2021

On novel approaches for classification. A proposal for an interdisciplinary debate.

- Computer Science
- Methods of information in medicine
- 2010

Standard statistics can be used to judge whether a novel classification scheme performs significantly better than the standard classifier, and if two different classification schemes are applied to the same data set, each subject can be judged to be correctly classified by each of the two classifiers.

Biomedical Data Mining

- Computer Science, Medicine
- Methods of Information in Medicine
- 2009

The special topic of Methods of Information in Medicine on data mining in biomedicine is introduced, with selected papers from two workshops on Intelligent Data Analysis in bioMedicine (IDAMAP) held in Verona and Amsterdam.

## References


Classification of microarray data with penalized logistic regression

- Mathematics
- SPIE BiOS
- 2001

Penalized logistic regression performs well on a public data set (the MIT ALL/AML data) and is optimized with AIC (Akaike's Information Criterion), which essentially is a measure of prediction performance.
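As a reminder of what the criterion measures, here is a small sketch of the textbook AIC formula applied to a binary classifier's predicted probabilities. The function names and toy values are hypothetical; for a penalized fit, `n_params` would typically be an effective number of parameters rather than a raw coefficient count, which this sketch does not compute.

```python
import math


def bernoulli_loglik(y, p):
    """Log-likelihood of binary labels y under predicted probabilities p."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi) for yi, pi in zip(y, p))


def aic(loglik, n_params):
    """Akaike's Information Criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik


# Toy example (assumed values): an intercept-only model predicting 0.5 for every sample.
labels = [1, 0, 1, 1]
probs = [0.5, 0.5, 0.5, 0.5]
score = aic(bernoulli_loglik(labels, probs), n_params=1)
```

Candidate penalty weights can then be compared by fitting each one and keeping the model with the lowest score.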

Classification using partial least squares with penalized logistic regression

- Computer Science
- Bioinform.
- 2005

A new method combining partial least squares (PLS) and Ridge penalized logistic regression is proposed and the predictive performance of the resulting classification rule is illustrated on three data sets: Leukemia, Colon and Prostate.

Gene selection in cancer classification using sparse logistic regression with Bayesian regularization

- Computer Science
- Bioinform.
- 2006

A simple Bayesian approach can be taken to eliminate this regularization parameter entirely, by integrating it out analytically using an uninformative Jeffreys prior, and the improved algorithm (BLogReg) is then typically two or three orders of magnitude faster than the original algorithm, as there is no longer a need for a model selection step.

Optimizing logistic regression coefficients for discrimination and calibration using estimation of distribution algorithms

- Computer Science
- 2008

This work presents a novel approach for fitting the logistic regression model based on estimation of distribution algorithms (EDAs), a tool for evolutionary computation, from a double perspective: likelihood-based to calibrate the model and AUC-based to discriminate between the different classes.

Classification of gene microarrays by penalized logistic regression.

- Computer Science
- Biostatistics
- 2004

Classification of patient samples is an important aspect of cancer diagnosis and treatment. The support vector machine (SVM) has been successfully applied to microarray cancer diagnosis problems.…

Multivariate selection of genetic markers in diagnostic classification

- Computer Science
- Artif. Intell. Medicine
- 2004

Entropy-based gene ranking without selection bias for the predictive classification of microarray data

- Computer Science
- BMC Bioinformatics
- 2003

A process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance as well as improving on alternative parametric RFE reduction strategies.

Regularized ROC method for disease classification and biomarker selection with microarray data

- Computer Science
- Bioinform.
- 2005

The proposed method uses a sigmoid approximation to the area under the ROC curve as the objective function for classification and the threshold gradient descent regularization method for estimation and biomarker selection and yields parsimonious models with excellent classification performance.
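The idea of a sigmoid approximation to the AUC can be sketched as follows: the empirical AUC counts correctly ordered positive/negative pairs with a step function, and the smooth version replaces that step with a sigmoid so the objective becomes differentiable. This is a generic illustration, not the cited paper's implementation; the function names, the bandwidth `sigma`, and the toy scores are assumptions.

```python
import math


def empirical_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sigmoid_auc(scores, labels, sigma=0.1):
    """Smooth surrogate: the pairwise indicator is replaced by a sigmoid of the score gap."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # Adequate for small score gaps; very large negative gaps could overflow math.exp.
    total = sum(1.0 / (1.0 + math.exp(-(p - n) / sigma)) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


# Toy scores (assumed): both positives are ranked above both negatives.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 1, 0, 0]
exact = empirical_auc(toy_scores, toy_labels)
smooth = sigmoid_auc(toy_scores, toy_labels)
```

A smaller `sigma` brings the surrogate closer to the empirical AUC at the cost of steeper gradients.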

Sparse multinomial logistic regression: fast algorithms and generalization bounds

- Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2005

This paper introduces a true multiclass formulation based on multinomial logistic regression and derives fast exact algorithms for learning sparse multiclass classifiers that scale favorably in both the number of training samples and the feature dimensionality, making them applicable even to large data sets in high-dimensional feature spaces.

An Interior-Point Method for Large-Scale l1-Regularized Logistic Regression

- Computer Science
- J. Mach. Learn. Res.
- 2007

This paper describes an efficient interior-point method for solving large-scale l1-regularized logistic regression problems, and shows how a good approximation of the entire regularization path can be computed much more efficiently than by solving a family of problems independently.
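For orientation, the l1-regularized logistic regression problem is commonly written as below. The notation is an assumption and is not taken from this listing: $m$ samples $x_i$ with labels $b_i \in \{-1, +1\}$, weight vector $w$, intercept $v$, and regularization parameter $\lambda > 0$.

```latex
\min_{v,\, w} \;\; \frac{1}{m} \sum_{i=1}^{m} \log\!\bigl(1 + \exp\bigl(-b_i \,(w^{\top} x_i + v)\bigr)\bigr) \;+\; \lambda \,\lVert w \rVert_1
```

The regularization path is the family of solutions obtained as $\lambda$ varies; larger values of $\lambda$ drive more components of $w$ exactly to zero, which is what makes the penalty useful for gene selection.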