Synthetic Oversampling of Multi-Label Data based on Local Label Distribution
@inproceedings{Liu2019SyntheticOO, title={Synthetic Oversampling of Multi-Label Data based on Local Label Distribution}, author={Bin Liu and Grigorios Tsoumakas}, booktitle={ECML/PKDD}, year={2019} }
Class-imbalance is an inherent characteristic of multi-label data that affects the prediction accuracy of most multi-label learning methods. One efficient strategy to deal with this problem is to employ resampling techniques before training the classifier. Existing multi-label sampling methods alleviate the (global) imbalance of multi-label datasets. However, performance degradation is mainly due to rare subconcepts and overlapping of classes that could be analysed by looking at the local…
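The "(global) imbalance" mentioned in the abstract is often summarized with a per-label imbalance ratio: the count of the most frequent label divided by the count of each label. The sketch below is only an illustration of that measure on a toy binary label matrix. It is not the oversampling method of Liu and Tsoumakas, and the function names `label_counts` and `imbalance_ratios` as well as the data are hypothetical.

```python
from typing import List, Sequence


def label_counts(Y: Sequence[Sequence[int]]) -> List[int]:
    """Count the positive instances of each label in a 0/1 indicator matrix."""
    n_labels = len(Y[0])
    return [sum(row[j] for row in Y) for j in range(n_labels)]


def imbalance_ratios(Y: Sequence[Sequence[int]]) -> List[float]:
    """Per-label ratio: count of the most frequent label / count of this label.

    A ratio of 1.0 marks the most frequent label; larger values mark rarer
    labels. Labels that never occur get an infinite ratio.
    """
    counts = label_counts(Y)
    top = max(counts)
    return [top / c if c > 0 else float("inf") for c in counts]


# Toy data (assumed for illustration): 4 instances, 3 labels.
Y = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]

ratios = imbalance_ratios(Y)
mean_ratio = sum(ratios) / len(ratios)
print(ratios, mean_ratio)
```

A global measure like this says nothing about where the rare positives sit relative to their neighbours, which is the local view the abstract argues for.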
14 Citations
Integrating Unsupervised Clustering and Label-specific Oversampling to Tackle Imbalanced Multi-label Data
- Computer Science
- Proceedings of the 15th International Conference on Agents and Artificial Intelligence
- 2023
This paper proposes a minority class oversampling scheme, UCLSO, which integrates Unsupervised Clustering and Label-Specific data Oversampling, and shows that the proposed method performed very well compared to the other competing algorithms.
Feature construction and smote-based imbalance handling for multi-label learning
- Computer Science
- Inf. Sci.
- 2021
Towards Class-Imbalance Aware Multi-Label Learning
- Computer Science
- IEEE Transactions on Cybernetics
- 2022
A simple yet effective class-imbalance aware learning strategy called cross-coupling aggregation (COCOA) is proposed in this article, which works by simultaneously exploiting label correlations and exploring class imbalance.
Combining multi-label classifiers based on projections of the output space using Evolutionary algorithms
- Computer Science
- Knowl. Based Syst.
- 2020
Joint Learning of Binary Classifiers and Pairwise Label Correlations for Multi-label Image Classification
- Computer Science
- 2020 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)
- 2020
This paper jointly learns the binary classifiers and pairwise label correlations (JBP) in an end-to-end manner and introduces an online hard sample mining strategy to focus on distinguishing confusing label pairs.
EvoSplit: An evolutionary approach to split a multi-label data set into disjoint subsets
- Computer Science
- Applied Sciences
- 2021
A single-objective evolutionary approach is introduced that tries to obtain a split that maximizes the similarity between those distributions independently, and a new multi-objective evolutionary algorithm is presented to maximize the similarity considering both distributions simultaneously.
Learning Fairly With Class-Imbalanced Data for Interference Coordination
- Computer Science
- IEEE Transactions on Vehicular Technology
- 2021
A training method to encourage fairness among classes by minimizing the maximal cost of decisions among classes is proposed, which is converted into a problem to optimize the weighting factors on the training cost of each class.
Simultaneous and Spatiotemporal Detection of Different Levels of Activity in Multidimensional Data
- Computer Science
- IEEE Access
- 2020
A new multilabeling technique is introduced, which assigns different labels to different regions of interest in the data and thus incorporates the spatial aspect; its ability to detect frequent motion patterns based on predicted spatiotemporal activity levels is also discussed.
Local Imbalance based Ensemble for Predicting Interactions between Novel Drugs and Targets
- Biology
- 2020
The proposed ensemble approaches consist of several DTI prediction models learned on training subsets which have been defined by different sampling strategies and indicate that the local imbalance-aware sampling strategy is the most effective.
References
SHOWING 1-10 OF 32 REFERENCES
Towards Label Imbalance in Multi-label Classification with Many Labels
- Computer Science
- ArXiv
- 2016
This work is the first to tackle the imbalance problem in multi-label classification with many labels by proposing a novel Representation-based Multi-label Learning with Sampling (RMLS) approach.
Making Classifier Chains Resilient to Class Imbalance
- Computer Science
- ACML
- 2018
Two extensions of ECC's basic approach are presented, where a varying number of binary models per label are built and chains of different sizes are constructed in order to improve the exploitation of majority examples with approximately the same computational budget.
Tackling Multilabel Imbalance through Label Decoupling and Data Resampling Hybridization
- Computer Science
- Neurocomputing
- 2019
MLSMOTE: Approaching imbalanced multilabel learning through synthetic instance generation
- Computer Science
- Knowl. Based Syst.
- 2015
Inverse random under sampling for class imbalance problem and its application to multi-label classification
- Computer Science
- Pattern Recognit.
- 2012
Dealing with Difficult Minority Labels in Imbalanced Mutilabel Data Sets
- Computer Science
- Neurocomputing
- 2019
Addressing Imbalance in Multi-Label Classification Using Structured Hellinger Forests
- Computer Science
- AAAI
- 2017
This work introduces an extension of structured forests, a type of random forest used for structured prediction, called Sparse Oblique Structured Hellinger Forests (SOSHF), and proposes a new imbalance-aware formulation by altering how the splitting functions are learned in two ways.
On the Stratification of Multi-label Data
- Computer Science
- ECML/PKDD
- 2011
This paper considers two stratification methods for multi-label data and empirically compares them along with random sampling on a number of datasets and reveals some interesting conclusions with respect to the utility of each method for particular types of multi-label datasets.
Addressing class-imbalance in multi-label learning via two-stage multi-label hypernetwork
- Computer Science
- Neurocomputing
- 2017
A First Approach to Deal with Imbalance in Multi-label Datasets
- Computer Science
- HAIS
- 2013
The process of learning from imbalanced datasets has been deeply studied for binary and multi-class classification, but proposals on how to measure and deal with imbalanced datasets in multi-label classification are scarce.