Corpus ID: 238583210

Certifying Robustness to Programmable Data Bias in Decision Trees

@inproceedings{Meyer2021CertifyingRT,
  title={Certifying Robustness to Programmable Data Bias in Decision Trees},
  author={Anna P. Meyer and Aws Albarghouthi and Loris D'Antoni},
  booktitle={Neural Information Processing Systems},
  year={2021}
}
Datasets can be biased due to societal inequities, human biases, underrepresentation of minorities, etc. Our goal is to certify that models produced by a learning algorithm are pointwise-robust to potential dataset biases. This is a challenging problem: it entails learning models for a large, or even infinite, number of datasets, ensuring that they all produce the same prediction. We focus on decision-tree learning due to the interpretable nature of the models. Our approach allows… 
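
The certification question can be made concrete with a toy sketch. The following is purely illustrative and is not the paper's algorithm (which reasons about the perturbation set rather than enumerating it): under a simple bias model, "the labels of up to k examples from one demographic group may have been flipped," it brute-forces every perturbed dataset, retrains a decision stump on each, and checks whether the prediction on a test point ever changes. All names (train_stump, certify_pointwise, k) are made up for this example.

from itertools import combinations
import numpy as np

def train_stump(X, y):
    """Learn a one-feature, one-threshold stump with majority-vote leaves."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            pred_l, pred_r = np.bincount(left).argmax(), np.bincount(right).argmax()
            err = (left != pred_l).sum() + (right != pred_r).sum()
            if best is None or err < best[0]:
                best = (err, j, t, pred_l, pred_r)
    _, j, t, pl, pr = best
    return lambda x: pl if x[j] <= t else pr

def certify_pointwise(X, y, group, x_test, k=1):
    """True iff the stump's prediction on x_test is identical for every dataset
    obtained by flipping the (binary) labels of at most k group-member rows."""
    base = train_stump(X, y)(x_test)
    members = np.where(group == 1)[0]
    for r in range(1, k + 1):
        for idx in combinations(members, r):
            y_biased = y.copy()
            y_biased[list(idx)] = 1 - y_biased[list(idx)]
            if train_stump(X, y_biased)(x_test) != base:
                return False          # found a bias witness: the prediction flips
    return True

# Toy data: 6 points, 1 feature; 'group' marks the potentially mislabeled rows.
X = np.array([[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]])
y = np.array([0, 0, 0, 1, 1, 1])
group = np.array([0, 1, 0, 0, 1, 0])
print(certify_pointwise(X, y, group, x_test=np.array([0.05]), k=1))

Enumeration of this kind blows up combinatorially and fails outright for infinite perturbation sets, which is exactly what makes the certification problem challenging.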

Citations

Certifying Data-Bias Robustness in Linear Regression

This work presents a technique for certifying whether linear regression models are pointwise-robust to label bias in the training dataset, i.e., whether bounded perturbations to the labels of a training dataset result in models that change the prediction of test points.
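
As a brief aside on why linear models are amenable to this kind of certificate: the least-squares prediction is linear in the training labels, so the worst-case shift under bounded label perturbations has a closed form. The sketch below illustrates only that observation, not the cited paper's procedure; the names (delta, m, worst_shift) are illustrative.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                  # training features
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
x_test = rng.normal(size=3)

# The OLS prediction is linear in the labels:
#   y_hat(x_test) = h . y   with   h = X (X^T X)^{-1} x_test
h = X @ np.linalg.solve(X.T @ X, x_test)

delta, m = 0.5, 5                             # each label may shift by <= delta,
                                              # on at most m training rows
worst_shift = delta * np.sort(np.abs(h))[-m:].sum()
y_hat = h @ y
print(f"prediction {y_hat:.3f} can move by at most ±{worst_shift:.3f}")
# If the downstream decision (e.g. a threshold test) is unchanged over the whole
# interval [y_hat - worst_shift, y_hat + worst_shift], the test point is robust
# to this label-bias model.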

FARE: Provably Fair Representation Learning

This work proposes Fairness with Restricted Encoders (FARE), the first FRL method with provable fairness guarantees, and develops and applies a practical statistical procedure that computes a high-confidence upper bound on the unfairness of any downstream classifier.
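
To make the "upper bound on the unfairness of any downstream classifier" concrete: for an encoder with finitely many output cells, the largest demographic-parity gap any downstream classifier can achieve equals the total-variation distance between the two groups' cell distributions. The sketch below computes that quantity empirically; FARE's actual procedure additionally produces a high-confidence bound with finite-sample corrections, which is omitted here. The function name worst_case_dp_gap is made up.

import numpy as np

def worst_case_dp_gap(cells, groups):
    """Max demographic-parity difference achievable by any classifier that only
    sees the encoder cell: the TV distance between per-group cell distributions."""
    cells, groups = np.asarray(cells), np.asarray(groups)
    gap = 0.0
    for c in np.unique(cells):
        p0 = np.mean(cells[groups == 0] == c)   # P(cell = c | group 0)
        p1 = np.mean(cells[groups == 1] == c)   # P(cell = c | group 1)
        gap += max(0.0, p0 - p1)                # worst classifier outputs 1 on
    return gap                                  # the cells favouring group 0

# e.g. 3-cell encoder outputs for 8 individuals and their group membership
print(worst_case_dp_gap([0, 0, 1, 2, 1, 1, 2, 2], [0, 0, 0, 0, 1, 1, 1, 1]))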

BagFlip: A Certified Defense against Data Poisoning

BagFlip is presented, a model-agnostic certified approach that can effectively defend against both trigger-less and backdoor attacks; it is equal to or more effective than state-of-the-art approaches for trigger-less attacks and more effective for backdoor attacks.

Crab: Learning Certifiably Fair Predictive Models in the Presence of Selection Bias

This research shows that Crab-MX not only achieves performance comparable to the baselines but also allows perfect fairness by achieving zero equal-opportunity difference.

References

Showing 1-10 of 41 references

Proving data-poisoning robustness in decision trees

This work presents a sound verification technique based on abstract interpretation and implements it in a tool called Antidote, which abstractly trains decision trees for an intractably large space of possible poisoned datasets and can produce proofs that, for a given input, the corresponding prediction would not have changed had the training set been tampered with or not.

Certified Robustness to Label-Flipping Attacks via Randomized Smoothing

This work presents a unifying view of randomized smoothing over arbitrary functions, and uses this novel characterization to propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.

Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks

This work proves the intrinsic certified robustness of bagging against data poisoning attacks and shows that bagging with an arbitrary base learning algorithm provably predicts the same label for a testing example when the number of modified, deleted, and/or inserted training examples is bounded by a threshold.
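
A simplified, conservative version of that argument (not the paper's tight certificate): each bag of k examples is drawn with replacement from the n training points, so it avoids all r modified examples with probability ((n - r) / n)^k; if the top label's vote share beats the runner-up's by more than twice the complement of that probability, no set of r modifications can flip the ensemble's prediction. The function and parameter names below are illustrative, and the sketch covers only modifications, not insertions or deletions.

def certified_modifications(p_top, p_runner_up, n, k):
    """Largest r such that modifying r training examples cannot flip the
    majority vote of a bagged ensemble with bags of size k drawn from n points."""
    r = 0
    while True:
        q = 1.0 - ((n - (r + 1)) / n) ** k     # prob. a bag touches a modified point
        if p_top - p_runner_up > 2.0 * q:
            r += 1
        else:
            return r

# e.g. 1000 training points, bags of 30, top label gets 80% of the ensemble
# vote and the runner-up 15%:
print(certified_modifications(0.80, 0.15, n=1000, k=30))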

Sever: A Robust Meta-Algorithm for Stochastic Optimization

This work introduces a new meta-algorithm that can take in a base learner such as least squares or stochastic gradient descent, and harden the learner to be resistant to outliers, and finds that in both cases it has substantially greater robustness than several baselines.

Technical note: Bias and the quantification of stability

This paper introduces a method for quantifying stability, based on a measure of the agreement between concepts, and discusses the relationships among stability, predictive accuracy, and bias.

Provably Robust Boosted Decision Stumps and Trees against Adversarial Attacks

This paper shows how to efficiently calculate and optimize an upper bound on the robust loss, which leads to state-of-the-art robust test error for boosted trees on MNIST (12.5% for $\epsilon_\infty=0.3$), FMNIST, and CIFAR-10 (74.7%).
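
The per-stump structure that makes such bounds computable can be sketched directly: under an l_inf perturbation of radius eps, a stump thresholding feature j at t can be forced into its other leaf exactly when |x_j - t| <= eps, so minimizing each stump's contribution independently lower-bounds the ensemble's score on any perturbed input (and hence upper-bounds the robust loss). This is an illustrative sketch of that observation under assumed names, not the paper's training procedure.

def stump_worst_case(x, j, t, left_val, right_val, eps):
    """Smallest value this stump can contribute when x may move by <= eps in l_inf."""
    reachable = []
    if x[j] - eps <= t:            # the adversary can land in the left leaf
        reachable.append(left_val)
    if x[j] + eps > t:             # the adversary can land in the right leaf
        reachable.append(right_val)
    return min(reachable)

def ensemble_robust_lower_bound(x, stumps, eps):
    """Lower bound on the ensemble score f(x') over all ||x' - x||_inf <= eps."""
    return sum(stump_worst_case(x, *s, eps) for s in stumps)

# stumps = [(feature, threshold, left_value, right_value), ...]
stumps = [(0, 0.5, -1.0, 1.0), (1, 0.2, 0.3, -0.7)]
print(ensemble_robust_lower_bound([0.45, 0.9], stumps, eps=0.1))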

Ensuring Fairness Beyond the Training Data

This work develops classifiers that are fair not only with respect to the training distribution, but also for a class of distributions that are weighted perturbations of the training samples.

Robust Decision Trees Against Adversarial Examples

The proposed algorithms can substantially improve the robustness of tree-based models against adversarial examples and present efficient implementations for classical information gain based trees as well as state-of-the-art tree boosting models such as XGBoost.
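
The robust-splitting idea can be illustrated with a brute-force toy (the cited paper uses an efficient approximation, not this enumeration): for a candidate threshold, training points whose feature value lies within eps of the threshold can be pushed to either side, and the split is scored by its worst case over all such assignments. Names like robust_split_score are made up for this sketch, which assumes binary labels.

from itertools import product
import numpy as np

def gini(labels):
    if len(labels) == 0:
        return 0.0
    p = np.bincount(labels, minlength=2) / len(labels)
    return 1.0 - np.sum(p ** 2)

def split_score(left, right):
    """Size-weighted Gini impurity of a two-way split (lower is better)."""
    n = len(left) + len(right)
    return (len(left) * gini(left) + len(right) * gini(right)) / n

def robust_split_score(xj, y, t, eps):
    """Worst-case (largest) impurity of threshold t when each point's feature
    value may shift by up to eps; brute force over the ambiguous points."""
    fixed_left = y[xj <= t - eps]
    fixed_right = y[xj > t + eps]
    ambiguous = y[(xj > t - eps) & (xj <= t + eps)]
    worst = 0.0
    for assign in product([0, 1], repeat=len(ambiguous)):
        side = np.array(assign)
        left = np.concatenate([fixed_left, ambiguous[side == 0]])
        right = np.concatenate([fixed_right, ambiguous[side == 1]])
        worst = max(worst, split_score(left, right))
    return worst

xj = np.array([0.10, 0.18, 0.22, 0.30, 0.80])
y = np.array([0, 0, 1, 1, 1])
print(robust_split_score(xj, y, t=0.25, eps=0.05),   # adversary-aware score
      split_score(y[xj <= 0.25], y[xj > 0.25]))       # nominal score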

Decision Tree Instability and Active Learning

A new measure of decision tree stability is introduced, and three aspects of active learning stability are defined, which are found to improve the stability and accuracy of C4.5 in the active learning setting.

Robustness meets algorithms

This work gives the first efficient algorithm for estimating the parameters of a high-dimensional Gaussian that is able to tolerate a constant fraction of corruptions that is independent of the dimension.