# Learning Decision Trees from Histogram Data

@inproceedings{Gurung2015LearningDT, title={Learning Decision Trees from Histogram Data}, author={Ram B. Gurung and Tony Lindgren and Henrik Bostr{\"o}m}, booktitle={DMIN 2015}, year={2015} }

When applying learning algorithms to histogram data, bins of such variables are normally treated as separate independent variables. However, this may lead to a loss of information as the underlying ...

#### 5 Citations

Learning Random Forest from Histogram Data Using Split Specific Axis Rotation

- Computer Science
- 2018

An adapted version of the random forest algorithm is proposed to be applied to data containing histogram variables and it is shown that this algorithm can be used to solve the classification problem of histogram variable replacement.

Learning Decision Trees and Random Forests from Histogram Data: An application to component failure prediction for heavy duty trucks

- Computer Science
- 2017

A large volume of data has become commonplace in many domains these days. Machine learning algorithms can be trained to look for any useful hidden patterns in such data. Sometimes, these big data m…

Learning Decision Trees from Histogram Data Using Multiple Subsets of Bins

- Computer Science
- FLAIRS Conference
- 2016

A sliding window approach to select subsets of the bins to be considered simultaneously while partitioning examples significantly reduces the number of possible splits to consider, allowing for substantially larger histograms to be handled.
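As a rough illustration of why a sliding window shrinks the search space, the sketch below compares contiguous windows of bins with all same-size subsets of bins. The summary above does not say how the windows are built, so the fixed window length, the contiguity of the windows, and the function names `sliding_windows` and `all_subsets` are assumptions made only for this example, not the paper's procedure.

```python
from itertools import combinations


def sliding_windows(n_bins, window_size):
    """Return contiguous windows of bin indices (assumed window shape)."""
    if not 1 <= window_size <= n_bins:
        raise ValueError("window_size must be between 1 and n_bins")
    return [
        tuple(range(start, start + window_size))
        for start in range(n_bins - window_size + 1)
    ]


def all_subsets(n_bins, subset_size):
    """Return every subset of bin indices of the given size, for comparison."""
    return list(combinations(range(n_bins), subset_size))


if __name__ == "__main__":
    n_bins, size = 10, 3  # hypothetical histogram with 10 bins
    print(len(sliding_windows(n_bins, size)))  # 8 candidate windows
    print(len(all_subsets(n_bins, size)))      # 120 unrestricted subsets
```

Under these assumptions the number of candidate bin sets grows linearly with the number of bins instead of combinatorially, which is consistent with the claim that larger histograms become tractable.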

Predicting NOx sensor failure in heavy duty trucks using histogram-based random forests

- Computer Science
- 2020

Being able to accurately predict the impending failures of truck components is often associated with significant amount of cost savings, customer satisfaction and flexibility in maintenance service…

Planning Flexible Maintenance for Heavy Trucks using Machine Learning Models, Constraint Programming, and Route Optimization

- Engineering, Computer Science
- 2017

Maintenance planning of trucks at Scania has previously been done using static cyclic plans with fixed sets of maintenance tasks, determined by mileage, calendar time, and some data driven physica…

#### References

SHOWING 1-10 OF 13 REFERENCES

Classification and Regression Trees

- Mathematics, Computer Science
- 1983

This chapter discusses tree classification in the context of medicine, where right Sized Trees and Honest Estimates are considered and Bayes Rules and Partitions are used as guides to optimal pruning.

Classification and multivariate analysis for complex data structures

- Computer Science
- 2011

This paper presents a meta-analysis of data mining, classification and discrimination in terms of categorical data and Latent Class approach, and the results show clear trends in both spatial and temporal data mining and classification.

An Incremental Method for Finding Multivariate Splits for Decision Trees

- Mathematics, Computer Science
- ML
- 1990

The PT2 algorithm is presented; it searches for a multivariate split at each node, is incremental, handles ordered and unordered variables, and estimates missing values.

Principal Component Analysis for Categorical Histogram Data: Some Open Directions of Research

- Mathematics
- 2011

In recent years, the analysis of symbolic data where the units are categories, classes or concepts described by interval, distributions, sets of categories and the like becomes a challenging task…

A New Wasserstein Based Distance for the Hierarchical Clustering of Histogram Symbolic Data

- Computer Science
- Data Science and Classification
- 2006

A new distance, based on the Wasserstein metric, is presented in order to cluster a set of data described by distributions with finite continuous support (called “histograms” in SDA), along with a measure of inertia of data with respect to a barycenter that satisfies the Huygens theorem of decomposition of inertia.

Distribution and Symmetric Distribution Regression Model for Histogram-Valued Variables

- Mathematics
- 2013

Histogram-valued variables are a particular kind of variables studied in Symbolic Data Analysis where to each entity under analysis corresponds a distribution that may be represented by a histogram…

Histogram PCA

- Computer Science
- ISNN
- 2007

An important attempt is made to analyze a symbolic data set for dimensionality reduction when the features are of histogram type, and basic arithmetic and definitions related to histogram data are proposed.

Linear regression for numeric symbolic variables: an ordinary least squares approach based on Wasserstein Distance

In this paper we present a linear regression model for modal symbolic data. The observed variables are histogram variables according to the definition given in Bock and Diday [1] and the parameters…

Design of multicategory multifeature split decision trees using perceptron learning

- Computer Science
- Pattern Recognit.
- 1994

A new top-down decision tree design method is presented that generates compact trees of superior performance by using multifeature splits in place of single feature splits at successive stages of the tree development.

Symbolic Data Analysis: Conceptual Statistics and Data Mining (Wiley Series in Computational Statistics)

- Computer Science, Mathematics
- 2007

This chapter discusses Descriptive Statistics: Two or More Variates, which focuses on the part of the model concerned with Hierarchy-Divisive Clustering and Cluster Analysis.