SMOTE for high-dimensional class-imbalanced data

@article{Blagus2012SMOTEFH,
  title={SMOTE for high-dimensional class-imbalanced data},
  author={Rok Blagus and Lara Lusa},
  journal={BMC Bioinformatics},
  year={2012}
}
Background: Classification using class-imbalanced data is biased in favor of the majority class. The bias is even larger for high-dimensional data, where the number of variables greatly exceeds the number of samples. The problem can be attenuated by undersampling or oversampling, which produce class-balanced data. Generally undersampling is helpful, while random oversampling is not. Synthetic Minority Oversampling TEchnique (SMOTE) is a very popular oversampling method that was proposed to…
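As a rough illustration of the two resampling ideas named in the abstract, the sketch below balances a toy data set by random undersampling and by random oversampling. It is only a minimal example under stated assumptions, not the procedure used in the paper: the function names `random_undersample` and `random_oversample`, the toy data, and the fixed seed are all hypothetical, and SMOTE itself is not implemented here.

```python
import random
from collections import Counter


def _indices_by_class(y):
    """Group sample indices by class label."""
    groups = {}
    for idx, label in enumerate(y):
        groups.setdefault(label, []).append(idx)
    return groups


def random_undersample(X, y, seed=0):
    """Randomly drop samples so every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    n_min = min(len(ix) for ix in groups.values())
    keep = []
    for ix in groups.values():
        keep.extend(rng.sample(ix, n_min))  # sampling without replacement
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


def random_oversample(X, y, seed=0):
    """Randomly duplicate samples so every class matches the largest class."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    n_max = max(len(ix) for ix in groups.values())
    keep = []
    for ix in groups.values():
        keep.extend(ix)
        # draw the missing samples with replacement (exact duplicates)
        keep.extend(rng.choices(ix, k=n_max - len(ix)))
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (hypothetical): 8 majority samples, 2 minority samples, 3 variables.
X = [[i, i + 1, i + 2] for i in range(10)]
y = [0] * 8 + [1] * 2

X_under, y_under = random_undersample(X, y)
X_over, y_over = random_oversample(X, y)

print(Counter(y_under))
print(Counter(y_over))
```

Undersampling discards majority-class samples, while random oversampling only repeats existing minority-class samples and adds no new information, which is one intuition for the contrast the abstract draws between the two.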

Citations

Publications citing this paper.
SHOWING 1-10 OF 110 CITATIONS, ESTIMATED 99% COVERAGE

SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary

CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

Application of Data Mining Technology on Surveillance Report Data of HIV/AIDS High-Risk Group in Urumqi from 2009 to 2015

CITES METHODS
HIGHLY INFLUENCED

Membrane Protein Type Prediction for High-Dimensional Imbalanced Datasets

  • Lei Guo, Shunfang Wang
  • Computer Science
  • 2018 9th International Conference on Information Technology in Medicine and Education (ITME)
  • 2018
CITES METHODS
HIGHLY INFLUENCED

Pattern recognition for water flooded layer based on ensemble classifier

  • Zhiqiang Geng, Xuan Hu, +3 authors Yanlin He
  • Computer Science
  • 2018 5th International Conference on Control, Decision and Information Technologies (CoDIT)
  • 2018
CITES METHODS
HIGHLY INFLUENCED

Automated classification of adverse events in pharmacovigilance

CITES METHODS
HIGHLY INFLUENCED

Development of New Bioinformatic Approaches for Human Genetic Studies

CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

An imbalanced data classification method based on automatic clustering under-sampling

CITES METHODS
HIGHLY INFLUENCED

A Hybrid Sampling Method Based on Safe Screening for Imbalanced Datasets with Sparse Structure

CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

CITATION STATISTICS

  • 10 Highly Influenced Citations

  • Averaged 26 Citations per year from 2017 through 2019

  • 9% Increase in citations per year in 2019 over 2018

References

Publications referenced by this paper.
SHOWING 1-10 OF 44 REFERENCES

Class prediction for high-dimensional class-imbalanced data

Learning from Imbalanced Data

HIGHLY INFLUENTIAL

Design and Analysis of DNA Microarray Investigations

  • RM Simon, EL Korn, +3 authors Y Zhao
  • 2004
HIGHLY INFLUENTIAL

Statistical Analysis of Gene Expression Microarray Data

  • TP Speed
  • Boca Raton: Chapman
  • 2003
HIGHLY INFLUENTIAL

SMOTE: Synthetic Minority Over-sampling Technique

HIGHLY INFLUENTIAL

Class Imbalance, Redux

HIGHLY INFLUENTIAL

Species distribution modeling and prediction: A class imbalance problem