Density-based weighting for imbalanced regression
@article{Steininger2021DensitybasedWF,
  title   = {Density-based weighting for imbalanced regression},
  author  = {Michael Steininger and Konstantin Kobs and Padraig Davidson and Anna Krause and Andreas Hotho},
  journal = {Machine Learning},
  year    = {2021},
  volume  = {110},
  pages   = {2187--2211}
}
In many real-world settings, imbalanced data impedes the performance of learning algorithms such as neural networks, mostly for rare cases. This is especially problematic for tasks that focus on these rare occurrences. For example, when estimating precipitation, extreme rainfall events are scarce but important given their potential consequences. While there are numerous well-studied solutions for classification settings, most of them cannot be applied to regression easily. Of the few…
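The core idea suggested by the title and abstract, re-weighting the regression loss by the inverse of the estimated label density so that rare targets count more, can be sketched as follows. This is a minimal sketch, not the paper's exact method: the `alpha` and `eps` parameters and the min-max normalization are assumptions, and SciPy's `gaussian_kde` stands in for whatever density estimator the authors use.

```python
import numpy as np
from scipy.stats import gaussian_kde

def density_weights(y, alpha=1.0, eps=1e-6):
    # Estimate the label density with a Gaussian KDE.
    dens = gaussian_kde(y)(y)
    # Min-max normalize the density to [0, 1].
    dens = (dens - dens.min()) / (dens.max() - dens.min() + 1e-12)
    # Down-weight common labels; eps keeps every weight positive.
    w = np.maximum(1.0 - alpha * dens, eps)
    # Rescale so the weights average to 1 (keeps the loss magnitude comparable).
    return w * len(w) / w.sum()

rng = np.random.default_rng(0)
# 1000 common targets around 0, 30 rare extremes around 8 (think: extreme rainfall).
y = np.concatenate([rng.normal(0.0, 1.0, 1000), rng.normal(8.0, 0.5, 30)])
w = density_weights(y)

# Density-weighted MSE against some predictions.
pred = np.zeros_like(y)
loss = np.mean(w * (pred - y) ** 2)
```

The rare cluster around 8 receives systematically larger weights than the dense cluster around 0, so a model trained with this loss is penalized more for ignoring the extremes.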
18 Citations
RankSim: Ranking Similarity Regularization for Deep Imbalanced Regression
- Computer Science, ICML
- 2022
RankSim is complementary to conventional imbalanced learning techniques, including re-weighting, two-stage training, and distribution smoothing, and lifts the state-of-the-art performance on three imbalanced regression benchmarks: IMDB-WIKI-DIR, AgeDB-DIR, and STS-B-DIR.
Balanced MSE for Imbalanced Visual Regression
- Computer Science, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2022
This work identifies that the widely used Mean Square Error (MSE) loss function can be ineffective in imbalanced regression and proposes a novel loss function, Balanced MSE, to accommodate the imbalanced training label distribution.
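The batch-based Monte Carlo variant of Balanced MSE can be sketched as follows, under assumptions: each prediction is scored against every target in the batch with a Gaussian kernel, and the loss is a cross-entropy where each prediction should match its own target. The noise scale `sigma` is a hyperparameter, and the paper's exact formulation may include additional scaling.

```python
import numpy as np
from scipy.special import logsumexp

def balanced_mse(pred, target, sigma=1.0):
    # Pairwise negative squared errors act as logits over the batch.
    logits = -((pred[:, None] - target[None, :]) ** 2) / (2.0 * sigma ** 2)
    # Log-softmax over the batch dimension; the "correct class" for
    # prediction i is its own target i (the diagonal).
    log_probs = logits - logsumexp(logits, axis=1, keepdims=True)
    return -np.mean(np.diag(log_probs))

pred = np.array([0.1, 2.0, 3.9])
target = np.array([0.0, 2.0, 4.0])
loss = balanced_mse(pred, target)
```

Unlike plain MSE, the softmax normalization makes a prediction compete against the whole (imbalanced) batch of labels, which is what counteracts the skewed training label distribution.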
Anomaly Detection using Contrastive Normalizing Flows
- Computer Science, arXiv
- 2022
This work proposes to use an unlabelled auxiliary dataset and a probabilistic outlier score for anomaly detection, and suggests that the contrastive normalizing flow can be used for various applications beyond anomaly detection.
Comparing Multiple Linear Regression, Deep Learning and Multiple Perceptron for Functional Points Estimation
- Computer Science, IEEE Access
- 2022
Both the PyTorch-based Deep Learning and Multiple Perceptron models outperformed Multiple Linear Regression and baseline models on the experimental dataset; in the studied dataset, Adjusted Function Points may not contribute to higher accuracy than Function Point Categories.
Two-Stage Fine-Tuning: A Novel Strategy for Learning Class-Imbalanced Data
- Computer Science, arXiv
- 2022
A two-stage fine-tuning strategy is proposed: first fine-tune the final layer of the pretrained model with a class-balanced reweighting loss, then perform standard fine-tuning, which allows the model to learn an initial representation of the specific task.
Taming the Long Tail of Deep Probabilistic Forecasting
- Computer Science, arXiv
- 2022
This work identifies a long tail behavior in the performance of state-of-the-art deep learning methods on probabilistic forecasting and presents two moment-based tailedness measurement concepts to improve performance on the difficult tail examples: Pareto Loss and Kurtosis Loss.
Affective Retrofitted Word Embeddings
- Computer Science, AACL
- 2022
A novel retrofitting method that learns a non-linear transformation function mapping pre-trained embeddings to an affective vector space, in a representation learning setting; it achieves better inter-cluster and intra-cluster distances for words having the same emotions, as evaluated through different cluster quality metrics.
An Adaptive Sampling Framework for Life Cycle Degradation Monitoring
- Computer Science, Sensors
- 2023
An adaptive sampling framework with segment intervals is proposed, based on a review of existing problems and improvements to them, to monitor mechanical degradation; the results are closely related to data status and degradation indicators.
Variation-based Cause Effect Identification
- Computer Science, arXiv
- 2022
A variation-based cause-effect identification framework for causal discovery in bivariate systems from a single observational setting, which relies on the principle of independence of cause and mechanism (ICM) under the assumption of an existing acyclic causal link and offers a practical realization of this principle.
References
Showing 1-10 of 31 references
Imbalanced regression and extreme value prediction
- Computer Science, Machine Learning
- 2020
This paper proposes SERA, a new evaluation metric capable of assessing model effectiveness and of optimising models towards the prediction of extreme values while penalising severe model bias.
SMOGN: a Pre-processing Approach for Imbalanced Regression
- Computer Science, LIDTA@PKDD/ECML
- 2017
The proposed algorithm, SMOGN, incorporates two existing proposals while addressing problems detected in both of them; it shows advantages in comparison to other approaches and is shown to have a different impact depending on the learner used.
SMOTE for Regression
- Computer Science, EPIA
- 2013
A modification of the well-known SMOTE algorithm that allows its use on regression tasks by changing the distribution of the given training data set to decrease the imbalance between the rare target cases and the most frequent ones.
Kernel density estimation based sampling for imbalanced class distribution
- Computer Science, Inf. Sci.
- 2020
A Survey of Predictive Modeling on Imbalanced Domains
- Computer Science, ACM Comput. Surv.
- 2016
The main challenges raised by imbalanced domains are discussed, a definition of the problem is proposed, the main approaches to these tasks are described, and a taxonomy of the methods is proposed.
Learning Deep Representation for Imbalanced Classification
- Computer Science, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016
The representation learned by this approach, when combined with a simple k-nearest neighbor (kNN) algorithm, shows significant improvements over existing methods on both high- and low-level vision classification tasks that exhibit imbalanced class distribution.
Class-Balanced Loss Based on Effective Number of Samples
- Computer Science, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
This work designs a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, yielding a class-balanced loss, and introduces a novel theoretical framework that measures data overlap by associating with each sample a small neighboring region rather than a single point.
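The effective-number re-weighting described above reduces to a short formula: the effective number of samples for a class with n examples is (1 - β^n) / (1 - β), and each class is weighted inversely to it. A minimal sketch, with β as the paper's hyperparameter (typically close to 1) and the final normalization an assumption:

```python
import numpy as np

def class_balanced_weights(counts, beta=0.999):
    # Effective number of samples per class: (1 - beta^n) / (1 - beta).
    effective_num = (1.0 - np.power(beta, counts.astype(float))) / (1.0 - beta)
    # Weight each class inversely to its effective number,
    # so rare classes get larger weights.
    w = 1.0 / effective_num
    # Normalize so the weights sum to the number of classes.
    return w * len(w) / w.sum()

# A long-tailed class distribution: 1000, 100, and 10 samples.
counts = np.array([1000, 100, 10])
w = class_balanced_weights(counts)
```

As β approaches 1 the scheme recovers plain inverse-frequency weighting, while β = 0 recovers uniform weights, which is why β controls how aggressively rare classes are boosted.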
Learning from imbalanced data: open challenges and future directions
- Computer Science, Progress in Artificial Intelligence
- 2016
Seven vital areas of research in this topic are identified, covering the full spectrum of learning from imbalanced data: classification, regression, clustering, data streams, big data analytics and applications, e.g., in social media and computer vision.
An extended tuning method for cost-sensitive regression and forecasting
- Computer Science, Decis. Support Syst.
- 2011
ADASYN: Adaptive synthetic sampling approach for imbalanced learning
- Computer Science, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)
- 2008
Simulation analyses on several machine learning data sets show the effectiveness of the ADASYN sampling approach across five evaluation metrics.