# Nonparametric semi-supervised learning of class proportions

```bibtex
@article{Jain2016NonparametricSL,
  title   = {Nonparametric semi-supervised learning of class proportions},
  author  = {Shantanu Jain and Martha White and Michael W. Trosset and Predrag Radivojac},
  journal = {ArXiv},
  year    = {2016},
  volume  = {abs/1601.01944}
}
```

The problem of developing binary classifiers from positive and unlabeled data is often encountered in machine learning. A common requirement in this setting is to approximate posterior probabilities of positive and negative classes for a previously unseen data point. This problem can be decomposed into two steps: (i) the development of accurate predictors that discriminate between positive and unlabeled data, and (ii) the accurate estimation of the prior probabilities of positive and negative…
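The two-step decomposition described above can be illustrated with a small sketch. Everything below is a hypothetical setup, not the paper's own nonparametric method: the data are 1-D Gaussians, an oracle scoring function stands in for step (i)'s trained positive-vs-unlabeled classifier, and step (ii) uses the classic Elkan–Noto constant-`c` rule as a stand-in prior estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

def normal_pdf(x, mu):
    """Unit-variance Gaussian density centered at mu."""
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2.0 * np.pi)

# Hypothetical 1-D setting: positives ~ N(2, 1), negatives ~ N(-2, 1);
# the unlabeled sample is a mixture with (unknown) positive prior 0.3.
alpha_true = 0.3
n_pos, n_unl = 2000, 10000
x_pos = rng.normal(2.0, 1.0, n_pos)
n_hidden_pos = rng.binomial(n_unl, alpha_true)
x_unl = np.concatenate([rng.normal(2.0, 1.0, n_hidden_pos),
                        rng.normal(-2.0, 1.0, n_unl - n_hidden_pos)])

# Step (i): a "nontraditional" classifier g(x) ~= p(labeled | x) that
# discriminates positive from unlabeled points.  Any probabilistic
# classifier can play this role; to keep the sketch self-contained we
# use the posterior implied by the positive density and the
# unlabeled-sample density (the latter is directly estimable from data).
def g(x):
    f_pos = normal_pdf(x, 2.0)
    f_unl = alpha_true * f_pos + (1.0 - alpha_true) * normal_pdf(x, -2.0)
    return n_pos * f_pos / (n_pos * f_pos + n_unl * f_unl)

# Step (ii): prior estimation in the style of Elkan & Noto (2008):
# c = p(labeled | positive) is constant under their assumptions, so
# average g over the labeled positives and solve
# n_pos / (n_pos + alpha * n_unl) = c for alpha.
c_hat = g(x_pos).mean()
alpha_hat = n_pos * (1.0 - c_hat) / (c_hat * n_unl)
# Class overlap biases c_hat slightly downward, so alpha_hat slightly
# overestimates the true prior of 0.3.
print(round(alpha_hat, 3))
```

With well-separated classes the estimate lands near the true proportion; with heavy overlap this simple rule degrades, which is one motivation for the more careful estimators surveyed on this page.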

## 50 Citations

### Fast Nonparametric Estimation of Class Proportions in the Positive-Unlabeled Classification Setting

- Computer Science, AAAI
- 2020

This work proposes an intuitive and fast nonparametric algorithm for estimating class proportions: it shows that the point of sharp increase in the recorded distances corresponds to the desired proportion of positives in the unlabeled set, and trains a deep neural network to identify that point.

### Estimating the class prior and posterior from noisy positives and unlabeled data

- Computer Science, NIPS
- 2016

This work develops a classification algorithm for estimating posterior distributions from positive-unlabeled data that is robust to noise in the positive labels and effective for high-dimensional data, and proves that the univariate transforms it employs preserve the class prior.

### Class Prior Estimation with Biased Positives and Unlabeled Examples

- Mathematics, AAAI
- 2020

This work starts by making a set of assumptions to model the sampling bias, and derives an algorithm for estimating the class priors that relies on clustering to decompose the original problem into subproblems of unbiased positive-unlabeled learning.

### A Variational Approach for Learning from Positive and Unlabeled Data

- Computer Science, NeurIPS
- 2020

A variational principle for PU learning is introduced that quantitatively evaluates the modeling error of the Bayesian classifier directly from given data, leading to a loss function that can be efficiently calculated without any intermediate step or model.

### DEDPUL: Difference-of-Estimated-Densities-based Positive-Unlabeled Learning

- Computer Science
- 2019

The mechanism behind DEDPUL is to apply a computationally cheap post-processing procedure to the predictions of any classifier trained to distinguish positive and unlabeled data; the method outperforms the current state of the art in both proportion estimation and PU classification.

### DEDPUL: Difference-of-Estimated-Densities-based Positive-Unlabeled Learning

- Computer Science, 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA)
- 2020

The mechanism behind DEDPUL is to apply a computationally cheap post-processing procedure to the predictions of any classifier trained to distinguish positive and unlabeled data; the method is flexible in the choice of classifier and outperforms the current state of the art in both proportion estimation and PU classification.
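The blurbs above do not spell out the density idea. As a hedged illustration of the general ratio-of-densities principle that such estimators build on (this is not DEDPUL's actual procedure), the positive proportion in the unlabeled sample can be bounded by the smallest ratio of unlabeled to positive density, estimated here with simple shared-bin histograms on synthetic 1-D data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical synthetic PU data: positives ~ N(2, 1); the unlabeled
# sample mixes positives and N(-2, 1) negatives with true proportion 0.3.
alpha_true = 0.3
n = 20000
x_pos = rng.normal(2.0, 1.0, n)
n_hidden = rng.binomial(n, alpha_true)
x_unl = np.concatenate([rng.normal(2.0, 1.0, n_hidden),
                        rng.normal(-2.0, 1.0, n - n_hidden)])

# Since f_unl = alpha * f_pos + (1 - alpha) * f_neg, the ratio
# f_unl / f_pos is at least alpha everywhere, with near-equality where
# negatives vanish.  Estimate both densities with a shared histogram
# and take the minimum ratio over bins with enough positive mass.
bins = np.linspace(-6.0, 6.0, 25)
hist_pos, _ = np.histogram(x_pos, bins=bins)
hist_unl, _ = np.histogram(x_unl, bins=bins)
reliable = hist_pos >= 500          # skip bins where f_pos is tiny/noisy
ratio = hist_unl[reliable] / hist_pos[reliable]
alpha_hat = ratio.min()
print(round(alpha_hat, 3))
```

Taking a hard minimum over noisy bins biases the estimate downward; practical estimators replace the raw histogram ratio with smoother density estimates and more careful extremum rules.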

### Recovering True Classifier Performance in Positive-Unlabeled Learning

- Computer Science, AAAI
- 2017

This work shows that typically used performance measures, such as the receiver operating characteristic curve or the precision-recall curve obtained on such data, can be corrected with knowledge of the class priors, i.e., the proportions of positive and negative examples in the unlabeled data.

### Class Prior Estimation in Active Positive and Unlabeled Learning

- Computer Science, IJCAI
- 2020

This paper tackles the problem of estimating the proportion of positive examples from positive and unlabeled data when the observed labels were acquired via active learning, and designs an algorithm that estimates the class prior for a given active learning strategy.

### Mixture Proportion Estimation and PU Learning: A Modern Approach

- Computer Science, NeurIPS
- 2021

Two simple techniques are proposed: Best Bin Estimation (BBE) for MPE, and Conditional Value Ignoring Risk (CVIR), a simple objective for PU learning; both come with formal guarantees that hold whenever a model can cleanly separate out a small subset of positive examples.

### Alternate Estimation of a Classifier and the Class-Prior from Positive and Unlabeled Data

- Mathematics, Computer Science, ArXiv
- 2018

This paper proposes a novel unified approach to estimating the class-prior and training a classifier alternately, which is simple to implement and computationally efficient.

## References

Showing 1–10 of 42 references

### Analysis of Learning from Positive and Unlabeled Data

- Computer Science, NIPS
- 2014

This paper first shows that this problem can be solved by cost-sensitive learning between positive and unlabeled data, and then shows that convex surrogate loss functions such as the hinge loss may lead to a wrong classification boundary due to an intrinsic bias, which can be avoided by using non-convex loss functions such as the ramp loss.

### Class Prior Estimation from Positive and Unlabeled Data

- Mathematics, Computer Science, IEICE Trans. Inf. Syst.
- 2014

A new method is proposed that estimates the class prior by partially matching the class-conditional density of the positive class to the input density, with the partial matching performed in terms of the Pearson divergence.

### Semi-Supervised Novelty Detection

- Computer Science, J. Mach. Learn. Res.
- 2010

It is argued that novelty detection in this semi-supervised setting is naturally solved by a general reduction to a binary classification problem; this also yields a solution to the general two-sample problem, that is, determining whether two random samples arise from the same distribution.

### Learning from positive and unlabeled examples by enforcing statistical significance

- Computer Science, AISTATS
- 2011

This work formalizes the problem of characterizing the positive class as one of learning a feature-based score function that minimizes the p-value of a nonparametric statistical hypothesis test, and provides a solution computed by a one-class SVM applied to a surrogate dataset.

### Learning classifiers from only positive and unlabeled data

- Computer Science, KDD
- 2008

This paper shows that models trained using the new methods perform better than the current state-of-the-art biased SVM method for learning from positive and unlabeled examples, and applies them to solve a real-world problem: identifying protein records that should be included in an incomplete specialized molecular biology database.

### Novelty detection: Unlabeled data definitely help

- Computer Science, AISTATS
- 2009

This work considers the setting where an unlabeled and possibly contaminated sample is also available at learning time, and argues that novelty detection in this semi-supervised setting is naturally solved by a general reduction to a binary classification problem.

### Classification on Data with Biased Class Distribution

- Computer Science, ECML
- 2001

The proposed bootstrap method estimates class probabilities in order to improve classification accuracy on new data; experiments on a benchmark data set, with varying class probabilities in the unlabeled data and balanced class probabilities in the training data, provide strong evidence that the proposed methodology can significantly improve classification of unlabeled data.

### Classification with Asymmetric Label Noise: Consistency and Maximal Denoising

- Computer Science, Mathematics, COLT
- 2013

This work gives conditions that are necessary and sufficient for the true class-conditional distributions to be identifiable, and argues that this pair corresponds in a certain sense to maximal denoising of the observed distributions.

### The Foundations of Cost-Sensitive Learning

- Computer Science, IJCAI
- 2001

It is argued that changing the balance of negative and positive training examples has little effect on the classifiers produced by standard Bayesian and decision tree learning methods, and the recommended way of applying one of these methods is to learn a classifier from the training set and then to compute optimal decisions explicitly using the probability estimates given by the classifier.