Learn More
The methodology used to construct tree-structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and …
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree …
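The abstract's "random vector per tree" can be made concrete with a minimal sketch: each tree is grown on its own bootstrap sample and restricted to a random feature subset at each split, and prediction is a plurality vote. This assumes scikit-learn's DecisionTreeClassifier and integer class labels; all parameter choices here are illustrative, not the paper's.

```python
# Minimal random-forest sketch: each tree depends on an independent random
# vector (bootstrap indices + random feature subsets); predict by vote.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_forest(X, y, n_trees=100, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)  # bootstrap sample of the training set
        tree = DecisionTreeClassifier(
            max_features="sqrt",          # random feature subset at each split
            random_state=int(rng.integers(1 << 31)),
        )
        trees.append(tree.fit(X[idx], y[idx]))
    return trees

def predict_forest(trees, X):
    votes = np.stack([t.predict(X) for t in trees])  # shape (n_trees, n_samples)
    # plurality vote over trees; assumes integer class labels
    return np.apply_along_axis(
        lambda col: np.bincount(col.astype(int)).argmax(), 0, votes
    )
```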
Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as …
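For the numerical-outcome case described above, a short sketch of the procedure: fit one predictor per bootstrap replicate of the learning set and average the predictions. Scikit-learn regression trees as the base predictor and the replicate count are illustrative assumptions.

```python
# Minimal bagging sketch for regression: bootstrap replicates of the
# learning set, one predictor each, aggregated by averaging.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagged_predict(X_train, y_train, X_test, n_replicates=50, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X_train)
    preds = []
    for _ in range(n_replicates):
        idx = rng.integers(0, n, size=n)  # bootstrap replicate of the learning set
        model = DecisionTreeRegressor().fit(X_train[idx], y_train[idx])
        preds.append(model.predict(X_test))
    return np.mean(preds, axis=0)         # aggregate by averaging the versions
```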
Recent work has shown that combining multiple versions of unstable classifiers such as trees or neural nets results in reduced test set error. To study this, the concepts of bias and variance of a classifier are defined. Unstable classifiers can have universally low bias. Their problem is high variance. Combining multiple versions is a variance reducing …
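The variance-reduction claim can be checked with a small simulation: draw many independent learning sets from the same distribution, and compare the variance (at a fixed query point) of a single unstable predictor against an average of bootstrapped copies. The data generator, sample sizes, and counts below are illustrative assumptions.

```python
# Simulation sketch: variance of one tree vs. an averaged ensemble across
# repeated learning sets drawn from the same distribution.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
x0 = np.array([[0.5]])                  # fixed query point
single, combined = [], []
for _ in range(200):                    # 200 independent learning sets
    X = rng.uniform(0, 1, (100, 1))
    y = np.sin(4 * X[:, 0]) + rng.normal(0, 0.3, 100)
    single.append(DecisionTreeRegressor().fit(X, y).predict(x0)[0])
    reps = []
    for _ in range(25):                 # combine 25 bootstrapped versions
        idx = rng.integers(0, 100, 100)
        reps.append(DecisionTreeRegressor().fit(X[idx], y[idx]).predict(x0)[0])
    combined.append(np.mean(reps))
print("var single tree:", np.var(single))
print("var combined:  ", np.var(combined))
```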
Stacking regressions is a method for forming linear combinations of different predictors to give improved prediction accuracy. The idea is to use cross-validation data and least squares under non-negativity constraints to determine the coefficients in the combination. Its effectiveness is demonstrated in stacking regression trees of different sizes and in a …
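A minimal sketch of that recipe, assuming scipy's non-negative least squares solver and regression trees of different sizes as the predictors being stacked: build out-of-fold predictions for each tree size, then solve for non-negative combination weights. The depths and fold count are illustrative assumptions.

```python
# Stacked-regressions sketch: cross-validated predictions + least squares
# under non-negativity constraints to get the combination coefficients.
import numpy as np
from scipy.optimize import nnls
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeRegressor

def stack_trees(X, y, depths=(1, 3, 5, None), n_folds=10, seed=0):
    Z = np.zeros((len(y), len(depths)))   # out-of-fold predictions, one column per tree size
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        for j, d in enumerate(depths):
            model = DecisionTreeRegressor(max_depth=d).fit(X[train], y[train])
            Z[test, j] = model.predict(X[test])
    weights, _ = nnls(Z, y)               # least squares with weights >= 0
    return weights                        # coefficients of the linear combination
```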
There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to …
The theory behind the success of adaptive reweighting and combining algorithms (arcing) such as Adaboost (Freund and Schapire [1995, 1996a]) and others in reducing generalization error has not been well understood. By formulating prediction as a game where one player makes a selection from instances in the training set and the other a convex linear …
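The game-theoretic argument itself is truncated here, but the adaptive reweighting and combining being analyzed can be illustrated with a minimal Adaboost-style loop: one side reweights training instances toward current mistakes, the other adds a base classifier, and the final rule is a weighted combination. Depth-1 scikit-learn trees as base classifiers and labels in {-1, +1} are illustrative assumptions.

```python
# Minimal Adaboost-style arcing sketch: adaptively reweight instances,
# then combine the resulting base classifiers with weights alpha.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def arc_boost(X, y, n_rounds=50):
    """Assumes y takes values in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                 # instance weights, one per training case
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = max(w[pred != y].sum(), 1e-12)
        if err >= 0.5:                      # no usable edge left
            break
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)      # upweight misclassified instances
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)

    def predict(Xq):
        score = sum(a * s.predict(Xq) for a, s in zip(alphas, stumps))
        return np.sign(score)
    return predict
```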
In regression analysis the response variable $Y$ and the predictor variables $X_1, \ldots, X_p$ are often replaced by functions $\theta(Y)$ and $\phi_1(X_1), \ldots, \phi_p(X_p)$. We discuss a procedure for estimating those functions $\theta^*$ and $\phi_1^*, \ldots, \phi_p^*$ that minimize $e^2 = E\{[\theta(Y) - \sum_{j=1}^{p} \phi_j(X_j)]^2\}/\mathrm{var}[\theta(Y)]$, given only a sample $\{(Y_k, X_{k1}, \ldots, X_{kp}),\ 1 \le k \le N\}$ and making minimal assumptions …
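A sketch of the alternating-conditional-expectations idea behind estimating $\theta^*$ and the $\phi_j^*$: alternately replace each transform by a smoothed conditional expectation of the current residual, standardizing $\theta$ each pass. The crude equal-count-bin smoother and iteration count below are illustrative assumptions, not the smoother used in the paper.

```python
# ACE-style sketch: alternate conditional-expectation updates for
# theta(Y) and phi_1(X_1), ..., phi_p(X_p).
import numpy as np

def smooth(x, r, n_bins=20):
    """Crude estimate of E[r | x] via equal-count bins (illustrative smoother)."""
    order = np.argsort(x)
    out = np.empty_like(r)
    for chunk in np.array_split(order, n_bins):
        out[chunk] = r[chunk].mean()
    return out

def ace(Y, X, n_iter=30):
    n, p = X.shape
    theta = (Y - Y.mean()) / Y.std()        # initial standardized theta(Y)
    phi = np.zeros((n, p))
    for _ in range(n_iter):
        for j in range(p):                  # phi_j <- E[theta - sum_{k!=j} phi_k | X_j]
            resid = theta - phi.sum(axis=1) + phi[:, j]
            phi[:, j] = smooth(X[:, j], resid)
            phi[:, j] -= phi[:, j].mean()
        theta = smooth(Y, phi.sum(axis=1))  # theta <- E[sum_j phi_j | Y]
        theta = (theta - theta.mean()) / theta.std()  # keep var[theta(Y)] = 1
    return theta, phi
```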