Using Random Forest to Learn Imbalanced Data

Chao Chen
In this paper we propose two ways to deal with the imbalanced data classification problem using random forest. One is based on cost-sensitive learning, and the other is based on a sampling technique. Performance metrics such as precision and recall, false positive rate and false negative rate, F-measure, and weighted accuracy are computed. Both methods are shown to improve the prediction accuracy of the minority class and to perform favorably compared with existing algorithms.
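The metrics named in the abstract can all be derived from a binary confusion matrix in which the minority class is treated as positive. The sketch below is illustrative only and is not taken from the paper: the names `Confusion` and `imbalance_metrics`, the example counts, and the weighted-accuracy form (beta times the true positive rate plus (1 - beta) times the true negative rate, with beta = 0.5) are assumptions, since the abstract does not define them. It also assumes every denominator is nonzero.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion counts, with the minority class as 'positive'."""
    tp: int  # minority cases predicted as minority
    fn: int  # minority cases predicted as majority
    fp: int  # majority cases predicted as minority
    tn: int  # majority cases predicted as majority


def imbalance_metrics(c: Confusion, beta: float = 0.5) -> dict:
    """Compute the metrics listed in the abstract from confusion counts.

    beta weights the minority-class accuracy in the weighted accuracy;
    this weighting scheme is an assumption of the sketch.
    """
    recall = c.tp / (c.tp + c.fn)            # true positive rate
    true_negative_rate = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    return {
        "accuracy": (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn),
        "precision": precision,
        "recall": recall,
        "false_positive_rate": c.fp / (c.fp + c.tn),
        "false_negative_rate": c.fn / (c.fn + c.tp),
        "f_measure": 2 * precision * recall / (precision + recall),
        "weighted_accuracy": beta * recall + (1 - beta) * true_negative_rate,
    }


# Hypothetical counts: 50 minority cases among 1000 examples.
metrics = imbalance_metrics(Confusion(tp=30, fn=20, fp=10, tn=940))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, plain accuracy is high even though 40% of the minority cases are missed, which is why class-aware metrics such as recall, F-measure, and weighted accuracy are reported alongside it.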
This paper has highly influenced 36 other papers.
This paper has 806 citations.



Publications citing this paper.

DPPred: An Effective Prediction Framework with Concise Discriminative Patterns

IEEE Transactions on Knowledge and Data Engineering • 2018
Highly Influenced

Discriminative Sparse Neighbor Approximation for Imbalanced Learning

IEEE Transactions on Neural Networks and Learning Systems • 2018
Highly Influenced

Predicting Attrition in Financial Data with Machine Learning Algorithms

Highly Influenced

Multi-class and feature selection extensions of Roughly Balanced Bagging for imbalanced data

Journal of Intelligent Information Systems • 2017
Highly Influenced

Use of ECDF-based features and ensemble of classifiers to accurately detect mobility activities of people using accelerometers

2017 9th International Conference on Communication Systems and Networks (COMSNETS) • 2017
Highly Influenced

Semantic Scholar estimates that this publication has 806 citations based on the available data.
