Corpus ID: 49563386

Learning under selective labels in the presence of expert consistency

@article{DeArteaga2018LearningUS,
  title={Learning under selective labels in the presence of expert consistency},
  author={Maria De-Arteaga and Artur W. Dubrawski and Alexandra Chouldechova},
  journal={ArXiv},
  year={2018},
  volume={abs/1807.00905}
}
We explore the problem of learning under selective labels in the context of algorithm-assisted decision making. The selective labels problem is a pervasive form of selection bias that arises when historical decision making blinds us to the true outcome for certain instances. Examples are common in many applications, ranging from predicting recidivism with pre-trial release data to diagnosing patients. In this paper we discuss why selective labels often cannot be effectively tackled by standard… 
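To make the setting concrete, here is a minimal simulation sketch (ours, not the paper's; the variable names and data-generating process are illustrative assumptions) of a pre-trial release scenario: outcomes are observed only for defendants the judge released, so the labeled sample is systematically different from the population the model will later score.

```python
# Minimal sketch (not from the paper): how selective labels arise in a
# pre-trial release setting. Outcomes are observed only for released cases,
# so the labeled sample is systematically lower-risk than the full population.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

x = rng.normal(size=n)                      # feature seen by both judge and model
u = rng.normal(size=n)                      # unobservable seen only by the judge
p_bad = 1 / (1 + np.exp(-(x + 0.5 * u)))    # true probability of a bad outcome
y = rng.binomial(1, p_bad)                  # true outcome

p_release = 1 / (1 + np.exp(x + u))         # judge releases low-risk cases more often
released = rng.binomial(1, p_release).astype(bool)
y_observed = np.where(released, y, np.nan)  # selective labels: missing when detained

print("outcome rate, all cases:    ", y.mean())
print("outcome rate, labeled cases:", np.nanmean(y_observed))
```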

Citations

Optimal Policies for the Homogeneous Selective Labels Problem

TLDR
This paper reports work in progress on learning decision policies in the face of selective labels, studying both a simplified homogeneous setting, which disregards individuals' features to make optimal policies tractable, and an online setting, which balances the costs incurred while learning against future utility.

Fairness Through Causal Awareness: Learning Latent-Variable Models for Biased Data

TLDR
It is shown experimentally that fairness-aware causal modeling provides better estimates of the causal effects among the sensitive attribute, the treatment, and the outcome, and evidence is presented that estimating these causal effects can help learn policies that are both more accurate and more fair when trained on a historically biased dataset.

Recovering from Biased Data: Can Fairness Constraints Improve Accuracy?

TLDR
The Equal Opportunity fairness constraint combined with ERM will provably recover the Bayes Optimal Classifier under a range of bias models, and theoretical results provide additional motivation for considering fairness interventions even if an actor cares primarily about accuracy.

On the Impossibility of Fairness-Aware Learning from Corrupted Data

TLDR
It is proved that there are situations in which an adversary can force any learner to return a biased classifier, with or without degrading accuracy, and that the strength of this bias increases for learning problems with underrepresented protected groups in the data.

Decision-Making Under Selective Labels: Optimal Finite-Domain Policies and Beyond

TLDR
This paper studies the learning of decision policies in the face of selective labels, in an online setting that balances learning costs against future utility, and proposes policies that achieve consistently superior utility with no parameter tuning in the finite-domain case and lower parameter sensitivity in the general case.

Bias In, Bias Out? Evaluating the Folk Wisdom

TLDR
Whether a prediction algorithm reverses or inherits bias depends critically on how the decision-maker affects the training data as well as the label used in training.

Leveraging Expert Consistency to Improve Algorithmic Decision Support

TLDR
This work considers the problem of estimating expert consistency indirectly when each case in the data is assessed by a single expert, proposes an influence function-based methodology as a solution, and incorporates the estimated expert consistency into a predictive model through a training-time label amalgamation approach.
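The amalgamation idea can be illustrated with a rough sketch (illustrative only; `amalgamate_labels`, its arguments, and the thresholding rule are our assumptions, and the influence-function estimator of expert consistency is not reproduced here): where experts are estimated to decide near-unanimously, their shared decision serves as a label proxy for cases whose true outcome was never observed.

```python
# Hedged sketch of training-time label amalgamation (illustrative only; not
# the paper's influence-function methodology).
import numpy as np

def amalgamate_labels(y_obs, decision, p_consistent, tau=0.9):
    """Build training targets under selective labels.

    y_obs        -- observed outcome, np.nan where the historical decision withheld it
    decision     -- the expert's decision, coded on the same scale as the label
    p_consistent -- estimated probability that experts would agree on this case
    tau          -- consistency level above which the expert decision is trusted
    """
    y_train = y_obs.copy()
    # For unlabeled cases where experts are estimated to be near-unanimous,
    # use the shared expert decision as a label proxy.
    trust_expert = np.isnan(y_obs) & (p_consistent >= tau)
    y_train[trust_expert] = decision[trust_expert]
    # Remaining unlabeled cases are simply dropped in this sketch.
    keep = ~np.isnan(y_train)
    return y_train[keep], keep
```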

Fairness-Aware PAC Learning from Corrupted Data

TLDR
This work considers fairness-aware learning under worst-case data manipulations, studies two natural learning algorithms that optimize for both accuracy and fairness, and shows that these algorithms enjoy guarantees that are order-optimal in terms of the corruption ratio and the protected group frequencies in the large-data limit.

Fairness-Aware Learning from Corrupted Data

TLDR
It is shown that an adversary can force any learner to return a biased classifier, with or without degrading accuracy, and that the strength of this bias increases for learning problems with underrepresented protected groups in the data.

References

Showing 1-10 of 16 references

The Selective Labels Problem: Evaluating Algorithmic Predictions in the Presence of Unobservables

TLDR
This work develops an approach called contraction that allows comparing the performance of predictive models and human decision-makers without resorting to counterfactual inference, and demonstrates the utility of the resulting evaluation metric by comparing human decisions and machine predictions.

Predict Responsibly: Increasing Fairness by Learning To Defer

TLDR
Experiments on real-world datasets demonstrate that learning to defer can make a model not only more accurate but also less biased, and it is shown that deferring models can still greatly improve the fairness of the entire pipeline.

Residual Unfairness in Fair Machine Learning from Prejudiced Data

TLDR
It is proved that, under certain conditions, fairness-adjusted classifiers will in fact induce residual unfairness that perpetuates the same injustices, against the same groups, that biased the data to begin with, thus showing that even state-of-the-art fair machine learning can have a "bias in, bias out" property.

Correcting Sample Selection Bias by Unlabeled Data

TLDR
A nonparametric method is presented that directly produces resampling weights without distribution estimation, by matching distributions between training and testing sets in feature space.
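As a hedged stand-in for the reweighting idea (a logistic-regression density-ratio sketch, not the paper's kernel mean matching method; `importance_weights` is our own name), a probabilistic classifier can estimate the ratio between test and training feature densities, which then serves as importance weights on the training loss:

```python
# Rough sketch of covariate-shift importance weighting via a discriminative
# density-ratio estimate (not the paper's kernel mean matching).
import numpy as np
from sklearn.linear_model import LogisticRegression

def importance_weights(X_train, X_test):
    """Estimate w(x) ~ p_test(x) / p_train(x) for reweighting the training loss."""
    X = np.vstack([X_train, X_test])
    s = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])  # 1 = test sample
    clf = LogisticRegression(max_iter=1000).fit(X, s)
    p_test = clf.predict_proba(X_train)[:, 1]
    # Converting the posterior to odds gives the density ratio up to a class-size constant.
    return (p_test / (1.0 - p_test)) * (len(X_train) / len(X_test))
```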

Human Decisions and Machine Predictions

TLDR
While machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals.

Review of inverse probability weighting for dealing with missing data

TLDR
This review describes how bias arises in complete-case analysis and how IPW can remove it, and explains why, despite multiple imputation (MI) generally being more efficient, IPW may sometimes be preferred.
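A generic sketch of the mechanics (not taken from the review; `ipw_mean` and its propensity model are illustrative assumptions): estimate each unit's probability of being observed from its covariates, then weight complete cases by the inverse of that probability.

```python
# Generic inverse probability weighting sketch for a missing outcome
# (illustrative; not an example from the review).
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_mean(y, X, observed):
    """Estimate the mean of y when some entries are missing, weighting complete
    cases by 1 / P(observed | X)."""
    prop = LogisticRegression(max_iter=1000).fit(X, observed.astype(int))
    p_obs = prop.predict_proba(X)[:, 1]
    w = observed / np.clip(p_obs, 1e-3, None)  # missing cases get zero weight
    return np.sum(w * np.nan_to_num(y)) / np.sum(w)
```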

Learning with Rejection

TLDR
A novel framework for classification with a reject option is introduced, consisting of simultaneously learning two functions: a classifier and a rejection function. Results of several experiments are reported showing that the kernel-based algorithms can yield a notable improvement over the best existing confidence-based rejection algorithm.
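For intuition, the simple confidence-based baseline the paper compares against can be sketched as follows (a minimal illustration with a hypothetical helper name, not the paper's jointly learned classifier and rejection function):

```python
# Minimal confidence-based rejection sketch (the simple baseline, not the
# paper's jointly learned classifier + rejection function).
import numpy as np

def predict_with_rejection(proba, threshold=0.8):
    """Return class predictions, with -1 marking rejected (deferred) cases.

    proba     -- (n_samples, n_classes) array of predicted class probabilities
    threshold -- minimum confidence required to commit to a prediction
    """
    confidence = proba.max(axis=1)
    preds = proba.argmax(axis=1)
    return np.where(confidence >= threshold, preds, -1)
```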

Statistical Analysis With Missing Data

  • N. Lazar, Technometrics, 2003
TLDR
Generalized Estimating Equations is a good introductory book for analyzing continuous and discrete correlated data using GEE methods and provides good guidance for analyzing correlated data in biomedical studies and survey studies.