Corpus ID: 1263369

Fifty Years of Classification and Regression Trees

Wei-Yin Loh
Fifty years have passed since the publication of the first regression tree algorithm. New techniques have added capabilities that far surpass those of the early methods. Modern classification trees can partition the data with linear splits on subsets of variables and fit nearest neighbor, kernel density, and other models in the partitions. Regression trees can fit almost every kind of traditional statistical model, including least-squares, quantile, logistic, Poisson, and proportional hazards…

Contrast trees and distribution boosting
  • J. Friedman
  • Mathematics, Computer Science
  • Proceedings of the National Academy of Sciences
  • 2020
Contrast trees represent an approach for assessing the accuracy of many types of machine-learning estimates that are not amenable to standard validation methods and can be used as diagnostic tools to reveal and then understand the inaccuracies of models produced by any learning method.
Software defect prediction model based on improved LLE-SVM
A new software defect prediction model based on improved Locally Linear Embedding and Support Vector Machines (ILLE-SVM) is proposed, which can search for the optimal parameters faster than the LLE-SVM model and performs better than LLE-SVM in software defect prediction.
Machine learning-based design features decision support tool via customers purchasing data analysis
A machine learning-based design-feature decision support tool is proposed, based on analysis of big sales data, and the physical feasibility of product feature combinations is considered in the customer preference analysis.
Machine Learning for Anomaly Detection in IoT networks: Malware analysis on the IoT-23 Data set
The Internet of Things is one of the newer developments in the domain of the Internet. It is defined as a network of connected devices and sensors, both physical and digital, that generate and…


An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests.
The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their usage for low and high-dimensional data exploration, but also to point out limitations of the methods and potential pitfalls in their practical application.
Log-normal regression modeling through recursive partitioning
This article discusses a method for fitting log-normal regression models to censored survival data through binary decision trees. Recursive partitioning is performed by analysis of the…
Log-gamma regression modeling through regression trees
This paper investigates a method for fitting piecewise log-gamma regression models to censored survival data. Partitioning is performed using analysis of the distributions of residuals and…
Classification and regression trees
  • W. Loh
  • Computer Science
  • Wiley Interdiscip. Rev. Data Min. Knowl. Discov.
  • 2011
This article gives an introduction to the subject of classification and regression trees by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples.
Functional Models for Regression Tree Leaves
  • L. Torgo
  • Mathematics, Computer Science
  • ICML
  • 1997
This study indicates that by integrating regression trees with other regression approaches the authors are able to overcome the limitations of individual methods in terms of both accuracy and computational efficiency.
LOTUS: An Algorithm for Building Accurate and Comprehensible Logistic Regression Trees
Logistic regression is a powerful technique for fitting models to data with a binary response variable, but the models are difficult to interpret if collinearity, nonlinearity, or interactions are…
A Bias Correction Algorithm for the Gini Variable Importance Measure in Classification Trees
This article considers a measure of variable importance frequently used in variable-selection methods based on decision trees and tree-based ensemble models. These models include CART, random…
Optimal Partitioning for Classification and Regression Trees
  • P. Chou
  • Mathematics, Computer Science
  • IEEE Trans. Pattern Anal. Mach. Intell.
  • 1991
An iterative algorithm that finds a locally optimal partition for an arbitrary loss function, in time linear in N for each iteration, is presented and it is proven that the globally optimal partition must satisfy a nearest neighbour condition using divergence as the distance measure.
Median Regression Tree for Analysis of Censored Survival Data
  • Hyung Jun Cho, Seung-Mo Hong
  • Mathematics, Computer Science
  • IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans
  • 2008
A new, unique, and efficient algorithm for tree-structured median regression modeling that combines the merits of both a median regression model and a tree-structured model is proposed and discussed.
Tree-Structured Methods for Longitudinal Data
The thrust of tree techniques is the extraction of meaningful subgroups characterized by common covariate values and homogeneous outcome. For longitudinal data, this homogeneity can pertain…