Differentially Private Counterfactuals via Functional Mechanism

Fan Yang, Qizhang Feng, Kaixiong Zhou, Jiahao Chen, Xia Hu
Counterfactual explanation, an emerging type of model explanation, has recently attracted significant attention from both industry and academia. Unlike conventional feature-based explanations (e.g., attributions), counterfactuals are hypothetical samples that flip model decisions with minimal perturbations to the query. Given valid counterfactuals, humans can reason under "what-if" circumstances and thus better understand the model's decision boundaries…
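As a concrete illustration of the counterfactual idea described above (not the paper's own algorithm), the following sketch runs a Wachter-style counterfactual search against a toy logistic-regression model; the weights `w`, `b`, the query `x`, and all hyperparameters are assumptions made for the example.

```python
import numpy as np

# Minimal sketch (illustrative; not the paper's method): Wachter-style
# counterfactual search for a toy logistic-regression model. We minimize
#   lam * (f(x') - target)^2 + ||x' - x||^2
# by gradient descent, trading off flipping the prediction against the
# size of the perturbation to the query.

w = np.array([1.5, -2.0])  # assumed toy model weights
b = 0.3                    # assumed toy model bias

def predict_proba(x):
    """Probability of the positive class under the toy model."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def counterfactual(x, target=0.9, lam=10.0, lr=0.01, steps=2000):
    """Find a nearby point whose predicted probability approaches `target`."""
    x_cf = x.copy()
    for _ in range(steps):
        p = predict_proba(x_cf)
        dp_dx = p * (1.0 - p) * w  # gradient of the sigmoid output w.r.t. x_cf
        grad = 2.0 * lam * (p - target) * dp_dx + 2.0 * (x_cf - x)
        x_cf = x_cf - lr * grad
    return x_cf

x = np.array([-1.0, 1.0])   # query currently classified as negative
x_cf = counterfactual(x)    # perturbed query that crosses the boundary
```

Here `predict_proba(x)` is about 0.04, while the returned `x_cf` lands on the positive side of the decision boundary; the distance penalty keeps it close to the original query rather than letting it reach the target probability exactly.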




Model-Based Counterfactual Synthesizer for Interpretation

This work first analyzes the model-based counterfactual process and constructs a base synthesizer using a conditional generative adversarial net (CGAN); it then enhances the MCS framework by incorporating the causal dependence among attributes as model inductive bias, and validates its design correctness from the causality-identification perspective.

Learning Model-Agnostic Counterfactual Explanations for Tabular Data

A framework called C-CHVAE is developed, drawing ideas from the manifold-learning literature, that generates faithful counterfactuals; it is also suggested to complement the catalog of counterfactual quality measures with a criterion that quantifies the degree of difficulty of a given counterfactual suggestion.

Explaining machine learning classifiers through diverse counterfactual explanations

This work proposes a framework for generating and evaluating a diverse set of counterfactual explanations based on determinantal point processes, and provides metrics that enable comparison of counterfactual-based methods with other local explanation methods.

Consequence-aware Sequential Counterfactual Generation

This work formulates the task as a multi-objective optimization problem, presents a genetic-algorithm approach to find optimal sequences of actions leading to the counterfactuals, and proposes a model-agnostic method for sequential counterfactual generation.

Good Counterfactuals and Where to Find Them: A Case-Based Technique for Generating Counterfactuals for Explainable AI (XAI)

This work proposes a new case-based approach for generating counterfactuals, using novel ideas about the counterfactual potential and explanatory coverage of a case base, and shows how this technique can improve the counterfactual potential and explanations of case bases that were previously found wanting.

Measurable Counterfactual Local Explanations for Any Classifier

A novel method is introduced for explaining the predictions of any classifier by using regression to generate local explanations, along with a definition of fidelity to the underlying classifier for local explanation models, based on distances to a target decision boundary.

Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting

This work examines the effect that overfitting and influence have on an attacker's ability to learn information about the training data from machine learning models, through either training-set membership inference or attribute inference attacks.

Functional Mechanism: Regression Analysis under Differential Privacy

The main idea is to enforce ε-differential privacy by perturbing the objective function of the optimization problem rather than its results; the approach significantly outperforms existing solutions.
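The recipe can be sketched for linear regression: write the squared loss as a polynomial in the model parameters, inject Laplace noise into its coefficients, and then minimize the noisy objective. The toy below follows the general recipe of Zhang et al.; the L1 sensitivity bound 2(d² + 2d) assumes each feature and label has been rescaled to [−1, 1], and the ridge term and data are illustrative assumptions, not the cited paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def functional_mechanism_linreg(X, y, epsilon):
    """Illustrative functional mechanism for linear regression.

    Perturbs the polynomial coefficients of the loss sum_i (y_i - x_i^T w)^2
    with Laplace noise, then minimizes the noisy objective. Assumes each
    feature and label lies in [-1, 1].
    """
    n, d = X.shape
    # Coefficients of the loss as a polynomial in w:
    # quadratic term A = sum_i x_i x_i^T, linear term b = -2 sum_i y_i x_i.
    A = X.T @ X
    b = -2.0 * X.T @ y
    delta = 2.0 * (d * d + 2 * d)        # L1 sensitivity bound for inputs in [-1, 1]
    scale = delta / epsilon
    A_noisy = A + rng.laplace(0.0, scale, size=A.shape)
    A_noisy = (A_noisy + A_noisy.T) / 2  # keep the quadratic form symmetric
    b_noisy = b + rng.laplace(0.0, scale, size=b.shape)
    # Minimize w^T A w + b^T w; a small ridge term keeps the solve well-posed.
    return np.linalg.solve(A_noisy + 1e-3 * np.eye(d), -b_noisy / 2.0)

# Toy data with features and labels in [-1, 1].
X = rng.uniform(-1.0, 1.0, size=(500, 2))
y = np.clip(X @ np.array([0.5, -0.3]), -1.0, 1.0)
w_priv = functional_mechanism_linreg(X, y, epsilon=1.0)
```

With the noise added to the objective's coefficients rather than to the fitted weights, the released `w_priv` is the exact minimizer of a perturbed loss, which is what distinguishes this mechanism from output perturbation.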

Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR

It is suggested that data controllers should offer a particular type of explanation, the unconditional counterfactual explanation, to support these three aims; such explanations describe the smallest change to the world that can be made to obtain a desirable outcome (i.e., to arrive at the closest possible world), without needing to explain the internal logic of the system.

On the Privacy Risks of Model Explanations

It is shown that backpropagation-based explanations can leak a significant amount of information about individual training data points, because they reveal statistical information about the model's decision boundary around an input, which can in turn reveal that input's membership in the training set.