# Under-bagging Nearest Neighbors for Imbalanced Classification

@article{Hang2021UnderbaggingNN, title={Under-bagging Nearest Neighbors for Imbalanced Classification}, author={Hanyuan Hang and Yuchao Cai and Hanfang Yang and Zhouchen Lin}, journal={ArXiv}, year={2021}, volume={abs/2109.00531} }

In this paper, we propose an ensemble learning algorithm called under-bagging k-nearest neighbors (under-bagging k-NN) for imbalanced classification problems. On the theoretical side, by developing a new learning theory analysis, we show that with properly chosen parameters, i.e., the number of nearest neighbors k, the expected sub-sample size s, and the number of bagging rounds B, optimal convergence rates for under-bagging k-NN can be achieved under mild assumptions w.r.t. the arithmetic mean (AM) of…
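As a rough illustration of the procedure the abstract describes, the sketch below runs B bagging rounds, each drawing a class-balanced sub-sample (under-sampling the larger classes) of expected size s, fits a plain k-NN vote on each sub-sample, and aggregates the votes. The parameter names (k, s, B) follow the abstract, but the exact sampling scheme and aggregation here are assumptions, not the paper's construction:

```python
import numpy as np

def under_bagging_knn(X_train, y_train, X_query, k=5, s=None, B=10, seed=0):
    """Hedged sketch of under-bagging k-NN: B rounds on class-balanced
    sub-samples of expected size s, aggregated by majority vote."""
    rng = np.random.default_rng(seed)
    X_train, y_train, X_query = map(np.asarray, (X_train, y_train, X_query))
    classes, counts = np.unique(y_train, return_counts=True)
    # balanced sub-sample: same number of points per class (assumption)
    per_class = counts.min() if s is None else max(1, s // len(classes))
    votes = np.zeros((len(X_query), len(classes)))
    for _ in range(B):
        # under-sample every class down to per_class points without replacement
        idx = np.concatenate([
            rng.choice(np.flatnonzero(y_train == c),
                       size=min(per_class, n), replace=False)
            for c, n in zip(classes, counts)
        ])
        Xs, ys = X_train[idx], y_train[idx]
        # brute-force k-NN majority vote on this round's sub-sample
        for i, x in enumerate(X_query):
            nn = ys[np.argsort(np.linalg.norm(Xs - x, axis=1))[:k]]
            votes[i] += (nn[:, None] == classes).sum(axis=0)
    return classes[votes.argmax(axis=1)]
```

Because each round sees a balanced sub-sample, the minority class is no longer swamped in the neighborhoods, while averaging over B rounds recovers the variance reduction of bagging.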

## References

Showing 1–10 of 99 references

KRNN: k Rare-class Nearest Neighbour classification

- Computer Science · Pattern Recognition
- 2017

An algorithm, k Rare-class Nearest Neighbour (KRNN), is proposed that directly adjusts the induction bias of kNN to form dynamic query neighbourhoods, and further adjusts the positive posterior probability estimate to bias classification towards the rare class.

Properties of bagged nearest neighbour classifiers

- Mathematics
- 2005

It is shown that bagging, a computationally intensive method, asymptotically improves the performance of nearest neighbour classifiers provided that the resample size is less than 69% of the actual…

A novel density-based adaptive k nearest neighbor method for dealing with overlapping problem in imbalanced datasets

- Computer Science · Neural Computing and Applications
- 2020

A density-based adaptive k nearest neighbor method, namely DBANN, is proposed that handles imbalanced and overlapping problems simultaneously and significantly outperforms state-of-the-art methods.

KNN-Based Overlapping Samples Filter Approach for Classification of Imbalanced Data

- Computer Science · Software Engineering Research, Management and Applications
- 2019

Experimental results indicate that the proposed under-sampling method can effectively improve five representative algorithms in terms of three popular metrics: area under the curve (AUC), G-mean, and F-measure.
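Of the three metrics above, G-mean is the one most specific to imbalanced classification: it is the geometric mean of sensitivity (true positive rate) and specificity (true negative rate), so it collapses to zero if either class is entirely misclassified. A minimal illustrative helper (our naming, not code from the cited paper), assuming binary labels in {0, 1}:

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)  # recall on the positive (often minority) class
    specificity = tn / (tn + fp)  # recall on the negative (often majority) class
    return math.sqrt(sensitivity * specificity)
```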

Compressed kNN: K-Nearest Neighbors with Data Compression

- Computer Science, Medicine · Entropy
- 2019

This paper presents a variation of the kNN algorithm, of the structure-less NN type, that works with categorical data; the compression allows the whole dataset to be kept in memory while considerably reducing the amount of memory required.

Class Based Weighted K-Nearest Neighbor over Imbalance Dataset

- Computer Science · PAKDD
- 2013

A modified version of the kNN algorithm is proposed that takes into account the class distribution in a wider region around the query instance, and outperforms current state-of-the-art approaches.

An Effective Evidence Theory Based K-Nearest Neighbor (KNN) Classification

- Computer Science · 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
- 2008

This paper studies various k-nearest neighbor (KNN) algorithms and presents a new KNN algorithm based on evidence theory that outperforms other KNN algorithms, including basic evidence-based KNN.

On the Statistical Consistency of Algorithms for Binary Classification under Class Imbalance

- Mathematics, Computer Science · ICML
- 2013

This paper studies consistency with respect to one performance measure, namely the arithmetic mean of the true positive and true negative rates (AM), and establishes that some practically popular approaches, such as applying an empirically determined threshold to a suitable class probability estimate or performing an empirically balanced form of risk minimization, are in fact consistent with respect to the AM.
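The AM measure referenced here is the same quantity the main paper's convergence rates are stated against. It is simply the balanced accuracy: a trivial all-majority classifier scores 0.5 regardless of how skewed the classes are. A small illustrative helper (our naming, assuming binary labels in {0, 1}):

```python
def am_score(y_true, y_pred):
    """Arithmetic mean (AM) of the true positive and true negative rates."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tpr = tp / (tp + fn)  # true positive rate (sensitivity)
    tnr = tn / (tn + fp)  # true negative rate (specificity)
    return 0.5 * (tpr + tnr)
```

Unlike plain accuracy, AM weights both classes equally, which is why predicting only the majority class cannot score above 0.5.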

Improving k Nearest Neighbor with Exemplar Generalization for Imbalanced Classification

- Mathematics, Computer Science · PAKDD
- 2011

This work proposes to identify exemplar minority-class training instances and generalize them to Gaussian balls as concepts for the minority class; the approach improves the performance of kNN and also outperforms popular re-sampling and cost-sensitive learning strategies for imbalanced classification.

On the Rate of Convergence of the Bagged Nearest Neighbor Estimate

- Mathematics, Computer Science · J. Mach. Learn. Res.
- 2010

Bagging is a simple way to combine estimates in order to improve their performance, and it is shown that the bagged nearest neighbor estimate may achieve the optimal rate of convergence, regardless of whether resampling is done with or without replacement.