The methodology used to construct tree-structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and …
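Tree-structured rules of the kind the abstract describes are built by recursive binary splitting. A minimal sketch of one split search for classification, scoring candidate thresholds by weighted Gini impurity (function and variable names are illustrative, not from the monograph):

```python
def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum of squared
    class proportions (0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(x, y):
    """Find the threshold on a single numeric feature that minimizes
    the size-weighted Gini impurity of the two child nodes."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    n = len(xs)
    best = (None, float("inf"))
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # no threshold separates tied values
        thr = (xs[i] + xs[i - 1]) / 2.0
        left, right = ys[:i], ys[i:]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (thr, score)
    return best
```

Growing a full tree repeats this search on each feature within each child node until the nodes are pure or too small.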
Linear and quadratic discriminant analysis are considered in the small-sample, high-dimensional setting. Alternatives to the usual maximum likelihood (plug-in) estimates for the covariance matrices are proposed. These alternatives are characterized by two parameters, the values of which are customized to individual situations by jointly minimizing a sample …
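A common form for such a two-parameter covariance estimate blends each class covariance with the pooled covariance, then shrinks the result toward a scaled identity. A minimal sketch in that spirit (parameter names `lam` and `gamma` are illustrative):

```python
def rda_covariance(S_k, S_pool, lam, gamma):
    """Two-parameter regularized class covariance estimate:
    first blend the class covariance S_k with the pooled covariance
    S_pool (controlled by lam), then shrink the blend toward a scaled
    identity with the same trace (controlled by gamma).
    Matrices are plain nested lists."""
    p = len(S_k)
    # Stage 1: blend class and pooled estimates.
    S = [[(1 - lam) * S_k[i][j] + lam * S_pool[i][j] for j in range(p)]
         for i in range(p)]
    # Stage 2: shrink toward (trace(S)/p) * identity.
    avg_eig = sum(S[i][i] for i in range(p)) / p
    return [[(1 - gamma) * S[i][j] + (gamma * avg_eig if i == j else 0.0)
             for j in range(p)] for i in range(p)]
```

With `lam = gamma = 0` this reduces to the plug-in class estimate; larger values trade variance for bias, which is what helps when the sample is small relative to the dimension.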
Gradient boosting constructs additive regression models by sequentially fitting a simple parameterized function (base learner) to current "pseudo"-residuals by least squares at each iteration. The pseudo-residuals are the gradient of the loss functional being minimized, with respect to the model values at each training data point, evaluated at the current …
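For squared-error loss the pseudo-residuals are simply `y - F(x)`, so the loop the abstract describes can be sketched directly, here with a regression stump as the base learner (function names and step size are illustrative):

```python
def fit_stump(x, r):
    """Least-squares regression stump on one feature: choose the split
    threshold minimizing the sum of squared errors, predicting the mean
    residual on each side."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    rs = [r[i] for i in order]
    n = len(xs)
    best = None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue
        thr = (xs[i] + xs[i - 1]) / 2.0
        lm = sum(rs[:i]) / i
        rm = sum(rs[i:]) / (n - i)
        sse = (sum((v - lm) ** 2 for v in rs[:i])
               + sum((v - rm) ** 2 for v in rs[i:]))
        if best is None or sse < best[3]:
            best = (thr, lm, rm, sse)
    thr, lm, rm, _ = best
    return lambda v: lm if v <= thr else rm

def gradient_boost(x, y, n_rounds=50, lr=0.1):
    """Gradient boosting for squared-error loss: each round fits a
    least-squares stump to the pseudo-residuals y - F(x) (the negative
    loss gradient at the current model) and adds it with step lr."""
    base = sum(y) / len(y)
    F = [base] * len(x)
    stumps = []
    for _ in range(n_rounds):
        resid = [yi - fi for yi, fi in zip(y, F)]
        h = fit_stump(x, resid)
        stumps.append(h)
        F = [fi + lr * h(xi) for fi, xi in zip(F, x)]
    return lambda v: base + lr * sum(h(v) for h in stumps)
```

Other losses change only the pseudo-residual formula; the fitting loop is unchanged.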
For high-dimensional supervised learning problems, using problem-specific assumptions can often lead to greater accuracy. For problems with grouped covariates, which are believed to have sparse effects both on a group and within-group level, we introduce a regularized model for linear regression with ℓ1 and ℓ2 penalties. We discuss the sparsity and other …
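The combined penalty such a model adds to the least-squares objective can be written as a lasso term plus a sum of groupwise Euclidean norms. A minimal sketch of evaluating it (names and the exact weighting are illustrative):

```python
import math

def sparse_group_penalty(beta, groups, lam1, lam2):
    """Combined penalty for a sparse-group-lasso-style regularizer:
    lam1 * ||beta||_1  (within-group sparsity)
    + lam2 * sum over groups g of ||beta_g||_2  (group sparsity).
    `groups` is a list of index lists partitioning the coefficients."""
    l1 = sum(abs(b) for b in beta)
    l2 = sum(math.sqrt(sum(beta[j] ** 2 for j in g)) for g in groups)
    return lam1 * l1 + lam2 * l2
```

The ℓ1 term zeroes individual coefficients, while the unsquared groupwise ℓ2 term zeroes whole groups at once; tuning `lam1` against `lam2` trades one kind of sparsity against the other.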
Regularization in linear regression and classification is viewed as a two-stage process. First a set of candidate models is defined by a path through the space of joint parameter values, and then a point on this path is chosen to be the final model. Various path-finding strategies for the first stage of this process are examined, based on the notion of …
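One classical path-finding strategy of the kind considered here is incremental forward stagewise: at each step, nudge the coefficient most correlated with the current residual by a small amount, and record the coefficient vector as one point on the path. A minimal sketch (names and step size illustrative):

```python
def stagewise_path(X, y, eps=0.01, n_steps=200):
    """Incremental forward stagewise regression: repeatedly move the
    coefficient whose predictor is most correlated with the current
    residual by +/- eps, recording the coefficient vector after each
    step. The recorded sequence is the regularization path.
    X is a list of rows; y a list of responses."""
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    path = [list(beta)]
    for _ in range(n_steps):
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p))
                 for i in range(n)]
        corr = [sum(X[i][j] * resid[i] for i in range(n))
                for j in range(p)]
        j = max(range(p), key=lambda k: abs(corr[k]))
        beta[j] += eps if corr[j] > 0 else -eps
        path.append(list(beta))
    return path
```

The second stage of the two-stage view then amounts to selecting one point on this path, e.g. by cross-validation.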
In regression analysis the response variable Y and the predictor variables X1, …, Xp are often replaced by functions θ(Y) and φ1(X1), …, φp(Xp). We discuss a procedure for estimating those functions θ* and φ1*, …, φp* that minimize e² = E{[θ(Y) − Σ_{j=1}^{p} φ_j(X_j)]²} / var[θ(Y)], given only a sample {(Y_k, X_{k1}, …, X_{kp}), 1 ≤ k ≤ N} and making minimal assumptions …
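A stripped-down version of the alternating idea behind this estimation, for a single categorical predictor with conditional means as the smoother, can be sketched as follows. This is an illustration of the alternation only, not the paper's full algorithm, and all names are illustrative:

```python
from collections import defaultdict

def cond_mean(by, target):
    """Smoother for categorical data: replace each target value with the
    mean of `target` within the matching level of `by`."""
    sums, counts = defaultdict(float), defaultdict(int)
    for b, t in zip(by, target):
        sums[b] += t
        counts[b] += 1
    return [sums[b] / counts[b] for b in by]

def ace_single(x, y, n_iter=20):
    """Alternating conditional expectations, stripped down to one
    categorical predictor: alternate phi(x) = E[theta(Y) | x] and
    theta(y) = E[phi(X) | y], standardizing theta each pass so the
    trivial zero solution is excluded."""
    n = len(y)
    theta = list(y)
    for _ in range(n_iter):
        m = sum(theta) / n
        theta = [t - m for t in theta]
        sd = (sum(t * t for t in theta) / n) ** 0.5
        theta = [t / sd for t in theta]
        phi = cond_mean(x, theta)      # phi step
        theta = cond_mean(y, phi)      # theta step
    return theta, phi
```

When the transformed variables are perfectly related, the two alternation steps reproduce each other and the normalized squared error e² reaches zero.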
The K-nearest-neighbor decision rule assigns an object of unknown class to the plurality class among the K labeled "training" objects that are closest to it. Closeness is usually defined in terms of a metric distance on the Euclidean space with the input measurement variables as axes. The metric chosen to define this distance can strongly affect …
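The baseline rule that this line of work starts from, with the usual Euclidean metric, is short enough to sketch directly (names are illustrative):

```python
import math
from collections import Counter

def knn_classify(query, X, y, k=3):
    """K-nearest-neighbor rule: return the plurality class among the
    k training points closest to `query` in Euclidean distance.
    X is a list of coordinate tuples, y the matching class labels."""
    nearest = sorted(range(len(X)),
                     key=lambda i: math.dist(query, X[i]))[:k]
    votes = Counter(y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

Replacing `math.dist` with a locally adapted metric is exactly the degree of freedom the abstract points at: the same voting rule can behave very differently under a different distance.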
A new procedure is proposed for clustering attribute-value data. When used in conjunction with conventional distance-based clustering algorithms, this procedure encourages those algorithms to automatically detect subgroups of objects that preferentially cluster on subsets of the attribute variables rather than on all of them simultaneously. The relevant …
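The key ingredient in this kind of procedure is a distance that weights attributes unequally, so that a group can cohere on a few attributes while ignoring the rest. A toy illustration of the idea; the inverse-spread weighting heuristic below is an assumption for illustration, not the paper's actual scheme:

```python
def weighted_dist(a, b, w):
    """Attribute-weighted squared distance between two objects:
    attributes with larger weights dominate the comparison."""
    return sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w))

def inverse_spread_weights(points):
    """Illustrative weighting heuristic (not the paper's): weight each
    attribute by the inverse of its variance within the group, so
    attributes on which the group agrees count more in the distance."""
    p, n = len(points[0]), len(points)
    weights = []
    for j in range(p):
        col = [pt[j] for pt in points]
        m = sum(col) / n
        var = sum((v - m) ** 2 for v in col) / n
        weights.append(1.0 / (var + 1e-9))  # epsilon avoids division by zero
    return weights
```

Feeding such groupwise-weighted distances into a conventional distance-based clusterer is what lets subgroups emerge on subsets of the attributes rather than on all of them at once.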