Jerome H. Friedman

Boosting (Freund & Schapire; Schapire & Singer) is one of the most important recent developments in classification methodology. The performance of many classification algorithms can often be dramatically improved by sequentially applying them to reweighted versions of the input data and taking a weighted majority vote of the sequence of classifiers thereby produced. …
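As a rough illustration of the reweight-and-vote scheme described above, here is a minimal AdaBoost-style sketch in Python; the decision-stump base learner, the number of rounds, and the {-1, +1} label convention are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=50):
    """Sequentially reweight the data and collect weighted weak classifiers.

    y is assumed to take values in {-1, +1}; stumps are an illustrative
    choice of base learner, not dictated by the abstract.
    """
    n = len(y)
    w = np.full(n, 1.0 / n)                      # start with uniform observation weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.clip(np.sum(w * (pred != y)) / np.sum(w), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)    # vote weight for this round
        w *= np.exp(-alpha * y * pred)           # upweight misclassified points
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # weighted majority vote of the sequence of classifiers
    votes = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(votes)
```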
We consider the problem of estimating sparse graphs by a lasso penalty applied to the inverse covariance matrix. Using a coordinate descent procedure for the lasso, we develop a simple algorithm--the graphical lasso--that is remarkably fast: it solves a 1000-node problem (approximately 500,000 parameters) in at most a minute and is 30 to 4000 times faster …
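For orientation, a small sketch of the graphical-lasso estimator as it is exposed in scikit-learn; the data, the 20-variable dimension, and the penalty level alpha below are made up for illustration.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))        # illustrative data: 200 samples, 20 variables

model = GraphicalLasso(alpha=0.1)         # alpha is the L1 penalty on the precision matrix
model.fit(X)

precision = model.precision_              # estimated sparse inverse covariance
edges = np.abs(precision) > 1e-8          # nonzero off-diagonal entries define the graph
np.fill_diagonal(edges, False)
print("estimated number of edges:", edges.sum() // 2)
```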
We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems, while the penalties include ℓ(1) (the lasso), ℓ(2) (ridge regression), and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, …
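A minimal numpy sketch of the cyclical coordinate-descent update for the elastic-net penalized linear model; the penalty parameterization and the assumption of standardized predictors are simplifications made here for brevity, not a faithful reimplementation of the published software.

```python
import numpy as np

def soft_threshold(z, gamma):
    """S(z, gamma) = sign(z) * max(|z| - gamma, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def elastic_net_cd(X, y, lam, alpha=0.5, n_iter=100):
    """Cyclical coordinate descent for
    (1/2n)||y - Xb||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||_2^2).

    Assumes the columns of X are standardized so that (1/n) x_j'x_j = 1,
    which keeps each coordinate update in closed form.
    """
    n, p = X.shape
    b = np.zeros(p)
    r = y - X @ b                                    # full residual, updated incrementally
    for _ in range(n_iter):
        for j in range(p):
            r_partial = r + X[:, j] * b[j]           # residual with predictor j removed
            rho = X[:, j] @ r_partial / n
            b_new = soft_threshold(rho, lam * alpha) / (1.0 + lam * (1.0 - alpha))
            r = r_partial - X[:, j] * b_new          # restore residual with the new coefficient
            b[j] = b_new
    return b
```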
We consider “one-at-a-time” coordinate-wise descent algorithms for a class of convex optimization problems. An algorithm of this kind has been proposed for L1-penalized regression (the lasso) in the literature, but it seems to have been largely ignored. Indeed, it seems that coordinate-wise algorithms are not often used in convex optimization. We show that …
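The closed-form one-at-a-time update the abstract refers to can be written as follows for the lasso; the assumption of standardized predictors (so each coordinate problem has unit curvature) is made here only to keep the expression simple.

```latex
% Objective: (1/2n) ||y - X b||^2 + \lambda ||b||_1, with each column x_j
% standardized so that (1/n) x_j' x_j = 1 (a simplifying assumption).
\[
  \tilde{b}_j \;=\; S\!\left(\tfrac{1}{n}\, x_j^{\top}\Big(y - \sum_{k \neq j} x_k b_k\Big),\; \lambda\right),
  \qquad
  S(z,\gamma) \;=\; \operatorname{sign}(z)\,\bigl(|z| - \gamma\bigr)_{+}.
\]
```

Each pass simply cycles over j = 1, …, p, applying this soft-thresholding update until the coefficients stabilize.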
An algorithm and data structure are presented for searching a file containing N records, each described by k real-valued keys, for the m closest matches or nearest neighbors to a given query record. The computation required to organize the file is proportional to kN log N. The expected number of records examined in each search is independent of the file …
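A sketch of the nearest-neighbor query this kind of data structure supports, using scipy's kd-tree as a stand-in implementation; the record count, key dimension, and query point are invented for the example.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
N, k, m = 10_000, 5, 3                    # N records, k real-valued keys, m neighbors
records = rng.standard_normal((N, k))

tree = cKDTree(records)                   # organize the file once, up front
query = rng.standard_normal(k)
dists, idx = tree.query(query, k=m)       # m closest matches to the query record
print(idx, dists)
```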
An algorithm for the analysis of multivariate data is presented and discussed in terms of specific examples. The algorithm seeks to find one- and two-dimensional linear projections of multivariate data that are relatively highly revealing. …
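To make the idea concrete, here is a toy projection-pursuit-style search: it maximizes a simple "interestingness" index over unit directions. The absolute-excess-kurtosis index and the random-restart optimizer are stand-ins chosen for brevity, not the index or procedure used in the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import kurtosis

rng = np.random.default_rng(0)
# Illustrative data: a mixture of two Gaussian clusters in 5 dimensions.
X = np.vstack([rng.normal(-2, 1, (200, 5)), rng.normal(2, 1, (200, 5))])

def neg_index(a):
    a = a / np.linalg.norm(a)             # project the data onto a unit direction
    z = X @ a
    return -abs(kurtosis(z))              # toy "interestingness" index (stand-in)

best = min((minimize(neg_index, rng.standard_normal(5)) for _ in range(10)),
           key=lambda r: r.fun)           # random restarts to avoid poor local optima
direction = best.x / np.linalg.norm(best.x)
print("most revealing 1-D projection direction:", np.round(direction, 2))
```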
We consider the group lasso penalty for the linear model. We note that the standard algorithm for solving the problem assumes that the model matrices in each group are orthonormal. Here we consider a more general penalty that blends the lasso (L1) with the group lasso (“two-norm”). This penalty yields solutions that are sparse at both the group and …
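One way to see the effect of the blended penalty is through its per-group proximal operator: elementwise soft-thresholding from the lasso part, followed by a groupwise shrink from the two-norm part. The sketch below illustrates that behavior; the penalty weights are arbitrary, and this is a reading of the penalty rather than the paper's fitting algorithm.

```python
import numpy as np

def soft_threshold(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_group_prox(v, lam1, lam2):
    """Proximal operator of lam1*||v||_1 + lam2*||v||_2 for one group:
    elementwise soft-threshold, then shrink the whole group toward zero."""
    u = soft_threshold(v, lam1)
    norm = np.linalg.norm(u)
    if norm <= lam2:
        return np.zeros_like(v)           # the entire group is set to zero
    return (1.0 - lam2 / norm) * u        # otherwise shrink the group, keeping elementwise zeros

# A group whose coefficients are all small is zeroed out entirely,
# while a strong group keeps only its large coefficients.
print(sparse_group_prox(np.array([0.2, -0.1, 0.15]), lam1=0.1, lam2=0.5))
print(sparse_group_prox(np.array([3.0, -0.05, 2.0]), lam1=0.1, lam2=0.5))
```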
Lazy learning algorithms, exemplified by nearest-neighbor algorithms, do not induce a concise hypothesis from a given training set; the inductive process is delayed until a test instance is given. Algorithms for constructing decision trees, such as C4.5, ID3, and CART, create a single “best” decision tree during the training phase, and this tree is then used …
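The contrast can be made concrete with off-the-shelf implementations: a k-nearest-neighbor classifier defers essentially all work to query time, while a decision tree is induced once during training and reused for every test instance. The dataset and settings below are arbitrary stand-ins.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Lazy learner: "fitting" mostly stores the training set; work happens at query time.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

# Eager learner: a single tree is induced up front and reused for every test instance.
tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

print("k-NN accuracy:", knn.score(X_te, y_te))
print("tree accuracy:", tree.score(X_te, y_te))
```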
We consider rules for discarding predictors in lasso regression and related problems, for computational efficiency. El Ghaoui and his colleagues have proposed 'SAFE' rules, based on univariate inner products between each predictor and the outcome, which guarantee that a coefficient will be 0 in the solution vector. This provides a reduction in the number of …
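A sketch of a SAFE-style screening check of this kind, with the bound written as it is usually summarized in the screening-rules literature; treat the exact form of the slack term as an assumption rather than a quotation of the paper.

```python
import numpy as np

def safe_discard(X, y, lam):
    """SAFE-style screening for the lasso (bound as commonly summarized;
    treat the exact form as an assumption).

    Predictor j is guaranteed a zero coefficient at penalty lam when
        |x_j' y| < lam - ||x_j|| * ||y|| * (lam_max - lam) / lam_max,
    where lam_max = max_j |x_j' y| is the smallest penalty giving the all-zero fit.
    """
    inner = np.abs(X.T @ y)                       # univariate inner products
    lam_max = inner.max()
    slack = np.linalg.norm(X, axis=0) * np.linalg.norm(y) * (lam_max - lam) / lam_max
    return inner < lam - slack                    # boolean mask of discarded predictors

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))
y = rng.standard_normal(100)
lam = 0.9 * np.abs(X.T @ y).max()
print("predictors discarded at lam = 0.9*lam_max:", safe_discard(X, y, lam).sum())
```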