Corpus ID: 462623

Mind the Gap: A Generative Approach to Interpretable Feature Selection and Extraction

@inproceedings{Kim2015MindTG,
  title={Mind the Gap: A Generative Approach to Interpretable Feature Selection and Extraction},
  author={Been Kim and J. Shah and Finale Doshi-Velez},
  booktitle={NIPS},
  year={2015}
}
We present the Mind the Gap Model (MGM), an approach for interpretable feature extraction and selection. By placing interpretability criteria directly into the model, we allow the model both to optimize parameters related to interpretability and to directly report a global set of distinguishable dimensions to assist with further data exploration and hypothesis generation. MGM extracts distinguishing features on real-world datasets of animal features, recipe ingredients, and disease co…
Inducing Semantic Grouping of Latent Concepts for Explanations: An Ante-Hoc Approach
Self-explainable deep models are devised to represent the hidden concepts in the dataset without requiring any post-hoc explanation generation technique. We worked with one such model, motivated by…
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
Concept Activation Vectors (CAVs) are introduced, which provide an interpretation of a neural net's internal state in terms of human-friendly concepts, and may be used to explore hypotheses and generate insights for a standard image classification network as well as a medical application.
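The construction behind a CAV can be sketched compactly: collect a layer's activations for concept examples and for random counterexamples, fit a linear separator between the two sets, and take the vector orthogonal to the decision boundary as the concept direction. The toy sketch below substitutes the simpler mean-difference direction for a trained linear classifier; all names and data are illustrative, not from the paper:

```python
def concept_activation_vector(concept_acts, random_acts):
    """Toy CAV: unit vector pointing from the mean of random-example
    activations toward the mean of concept-example activations."""
    dim = len(concept_acts[0])
    mu_c = [sum(a[i] for a in concept_acts) / len(concept_acts) for i in range(dim)]
    mu_r = [sum(a[i] for a in random_acts) / len(random_acts) for i in range(dim)]
    diff = [c - r for c, r in zip(mu_c, mu_r)]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff]

def concept_sensitivity(gradient, cav):
    """Directional derivative of a class logit along the CAV:
    positive means the concept pushes the prediction up."""
    return sum(g * v for g, v in zip(gradient, cav))

# Made-up 2-d activations: "striped" examples sit high on dimension 0.
striped = [[2.0, 0.1], [1.8, -0.1], [2.2, 0.0]]
randoms = [[0.0, 0.1], [0.2, -0.2], [-0.1, 0.0]]
cav = concept_activation_vector(striped, randoms)
score = concept_sensitivity([0.5, 0.0], cav)
```

The sign of the directional derivative along the CAV, aggregated over examples, is what yields a per-concept sensitivity score for a class.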
Towards Ground Truth Explainability on Tabular Data
Copulas, a concise specification of the desired statistical properties of a dataset, are proposed for tabular data so that users can build intuition around explainability using controlled data sets and experimentation.
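The premise, that a copula pins down a dataset's dependence structure independently of its marginals, can be illustrated with a toy Gaussian copula in pure Python; the function name and parameters below are illustrative, not from the paper:

```python
import math
import random

def gaussian_copula_pair(rho, n, seed=0):
    """Draw n pairs of uniform marginals coupled through a Gaussian
    copula with latent correlation rho: sample correlated normals,
    then map each through the standard normal CDF."""
    rng = random.Random(seed)
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # N(0,1) CDF
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((phi(z1), phi(z2)))
    return pairs

# Each marginal is Uniform(0, 1), but the pair is strongly dependent.
pairs = gaussian_copula_pair(0.8, 2000)
```

Swapping the final `phi` mapping for any other inverse-CDF transform changes the marginals while leaving the dependence structure fixed, which is the control the paper's ground-truth setup relies on.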
Considerations for Evaluation and Generalization in Interpretable Machine Learning
Definitions of interpretability are discussed, when interpretability is needed (and when it is not) is described, a taxonomy for rigorous evaluation is proposed, and recommendations for researchers are made.
Rationalization through Concepts
Experiments show that ConRAT is the first to generate concepts that align with human rationalization while using only the overall label, and it outperforms state-of-the-art methods trained on each aspect label independently.
Interacting with Predictions: Visual Inspection of Black-box Machine Learning Models
The design and implementation of an interactive visual analytics system, Prospector, that provides interactive partial dependence diagnostics and support for localized inspection allows data scientists to understand how and why specific datapoints are predicted as they are.
On Validating, Repairing and Refining Heuristic ML Explanations
Earlier work is extended to the case of boosted trees, and the quality of explanations obtained with state-of-the-art heuristic approaches is assessed.
Building Interpretable Models: From Bayesian Networks to Neural Networks
This dissertation explores the design of interpretable models based on Bayesian networks, sum-product networks and neural networks, and develops a novel method, Selective Bayesian Forest Classifier (SBFC), that strikes a balance between predictive power and interpretability.
Notions of explainability and evaluation approaches for explainable artificial intelligence
This systematic review contributes to the body of knowledge by clustering all the scientific studies via a hierarchical system that classifies theories and notions related to the concept of explainability and the evaluation approaches for XAI methods.
Towards A Rigorous Science of Interpretable Machine Learning
This position paper defines interpretability, describes when interpretability is needed (and when it is not), suggests a taxonomy for rigorous evaluation, and exposes open questions towards a more rigorous science of interpretable machine learning.

References

Showing 1-10 of 34 references
The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification
The Bayesian Case Model is presented, a general framework for Bayesian case-based reasoning (CBR) and prototype classification, clustering and subspace representation, that provides quantitative benefits in interpretability while preserving classification accuracy.
Feature Selection for Unsupervised Learning
This paper explores the feature selection problem and issues through FSSEM (Feature Subset Selection using Expectation-Maximization (EM) clustering) and through two different performance criteria for evaluating candidate feature subsets: scatter separability and maximum likelihood.
Contrastive Learning Using Spectral Methods
This paper formalizes the notion of contrastive learning for mixture models, and develops spectral algorithms for inferring mixture components specific to a foreground data set when contrasted with a background data set.
Comprehensible classification models: a position paper
This paper discusses the interpretability of five types of classification models, namely decision trees, classification rules, decision tables, nearest neighbors and Bayesian network classifiers, and the drawbacks of using model size as the only criterion to evaluate the comprehensibility of a model.
Learning Determinantal Point Processes
This thesis shows how determinantal point processes can be used as probabilistic models for binary structured problems characterized by global, negative interactions, and demonstrates experimentally that the techniques introduced allow DPPs to be used for real-world tasks like document summarization, multiple human pose estimation, search diversification, and the threading of large document collections.
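The "global, negative interactions" in the summary above come from the L-ensemble formulation: a DPP assigns a subset S of the ground set probability proportional to det(L_S), the determinant of the kernel restricted to S, so similar items repel each other. A minimal pure-Python sketch, with made-up kernel values for illustration:

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(m)
    a = [row[:] for row in m]
    sign, d = 1.0, 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(a[r][i]))
        if abs(a[p][i]) < 1e-12:
            return 0.0
        if p != i:
            a[i], a[p] = a[p], a[i]
            sign = -sign
        d *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return sign * d

def dpp_prob(L, S):
    """P(S) = det(L_S) / det(L + I) for an L-ensemble DPP."""
    n = len(L)
    LS = [[L[i][j] for j in S] for i in S]
    LI = [[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return (det(LS) if S else 1.0) / det(LI)

# Two highly similar items: the DPP makes selecting both unlikely.
L = [[1.0, 0.9], [0.9, 1.0]]
p_both = dpp_prob(L, [0, 1])
p_one = dpp_prob(L, [0])
```

Because det(L_S) shrinks as the rows for the items in S become more alike, a DPP naturally favors diverse subsets, which is why it suits summarization and search diversification.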
Clustering with the Fisher Score
This paper develops a novel but simple clustering algorithm specialized for the Fisher score, which can exploit important dimensions and is successfully tested in experiments with artificial data and real data.
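The Fisher score that the algorithm above specializes to has a short definition: for a single feature, the ratio of between-class scatter to within-class scatter of its values. A self-contained sketch on made-up data:

```python
def fisher_score(values, labels):
    """Fisher score of one feature: between-class scatter divided by
    within-class scatter; larger means more discriminative."""
    classes = sorted(set(labels))
    overall = sum(values) / len(values)
    between = within = 0.0
    for c in classes:
        vs = [v for v, l in zip(values, labels) if l == c]
        mu = sum(vs) / len(vs)
        between += len(vs) * (mu - overall) ** 2
        within += sum((v - mu) ** 2 for v in vs)
    return between / within if within else float("inf")

labels = [0, 0, 0, 1, 1, 1]
informative = [0.9, 1.1, 1.0, 3.0, 2.9, 3.1]  # cleanly separates the classes
noisy = [1.0, 3.0, 2.0, 1.1, 2.9, 2.1]        # roughly identical per class
```

Ranking features by this score is the usual supervised use; the clustering algorithm in the paper instead exploits the same criterion without given labels.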
Fully Sparse Topic Models
This paper shows that FSTM can perform substantially better than various existing topic models by different performance measures, and provides a principled way to directly trade off sparsity of solutions against inference quality and running time.
Document clustering via dirichlet process mixture model with feature selection
This paper proposes a novel approach, namely DPMFS, to group documents into a set of clusters, where the number of clusters is determined automatically by the Dirichlet process mixture model, and to identify the discriminative words and separate them from irrelevant noise words via a stochastic search variable selection technique.
Sparse Additive Generative Models of Text
This approach has two key advantages: it can enforce sparsity to prevent overfitting, and it can combine generative facets through simple addition in log space, avoiding the need for latent switching variables.
Unsupervised Variable Selection: when random rankings sound as irrelevancy
This paper proposes to combine multiple rankings to move toward a stable consensus variable subset in a totally unsupervised fashion.