Neo: Generalizing Confusion Matrix Visualization to Hierarchical and Multi-Output Labels

@article{Gortler2021NeoGC,
  title={Neo: Generalizing Confusion Matrix Visualization to Hierarchical and Multi-Output Labels},
  author={Jochen Gortler and Fred Hohman and Dominik Moritz and Kanit Wongsuphasawat and Donghao Ren and Rahul Nair and Marc Kirchner and Kayur Patel},
  journal={Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems},
  year={2021}
}
The confusion matrix, a ubiquitous visualization for helping people evaluate machine learning models, is a tabular layout that compares predicted class labels against actual class labels over all data instances. We conduct formative research with machine learning practitioners at Apple and find that conventional confusion matrices do not support more complex data structures found in modern-day applications, such as hierarchical and multi-output labels. To express such variations of confusion…
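As a minimal sketch of the tabular layout the abstract describes, the following Python counts (actual, predicted) label pairs into a matrix. The helper `confusion_matrix`, the toy labels, and the `parent` mapping used to roll classes up into a coarser level are all hypothetical illustrations, not Neo's specification or implementation.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a matrix where entry [i][j] counts instances whose actual
    label is labels[i] and whose predicted label is labels[j]."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Toy data (assumed for illustration only).
labels = ["cat", "dog", "bird"]
actual = ["cat", "cat", "dog", "dog", "bird", "bird"]
predicted = ["cat", "dog", "dog", "dog", "bird", "cat"]

matrix = confusion_matrix(actual, predicted, labels)

# One simple way to view hierarchical labels: map each leaf class to a
# hypothetical parent class and recount at the coarser level.
parent = {"cat": "mammal", "dog": "mammal", "bird": "bird"}
rolled_up = confusion_matrix(
    [parent[a] for a in actual],
    [parent[p] for p in predicted],
    ["mammal", "bird"],
)

for label, row in zip(labels, matrix):
    print(label, row)
```

Rows are actual classes and columns are predicted classes, so off-diagonal entries are the confusions. Rolling up to parent classes hides confusions within a family (here, cat versus dog) while keeping confusions across families visible.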

Calibrate: Interactive Analysis of Probabilistic Model Output

Calibrate constructs a reliability diagram that is resistant to drawbacks in traditional approaches and allows for interactive subgroup analysis and instance-level inspection; its utility is demonstrated through use cases on both real-world and synthetic data.

Visualization for Machine Learning

This tutorial seeks to provide a foundational understanding of the ways in which one can use visualization for machine learning tasks, in particular, visual techniques for model assessment, model understanding, and dimensionality reduction.

Horses to Zebras: Ontology-Guided Data Augmentation and Synthesis for ICD-9 Coding

An analysis technique inspired by confusion matrices is introduced for this setting; it points to the positive impact of data augmentation and synthesis, but also highlights more general issues of confusion within families of codes and of underprediction.

No Grammar to Rule Them All: A Survey of JSON-style DSLs for Visualization

  • Andrew M. McNutt
  • Computer Science
  • IEEE Transactions on Visualization and Computer Graphics
  • 2023
This work surveys and analyzes the design and implementation of 57 JSON-style DSLs for visualization and identifies tensions throughout these areas, such as between formal and colloquial specifications, among types of users, and within the composition of languages.

What Did My AI Learn? How Data Scientists Make Sense of Model Behavior

AIFinnity, a system for analyzing image-and-text models, was found to have a sensemaking workflow that reflected participants’ mental processes and enabled them to discover and validate diverse AI behaviors.

A Stratification Matrix Viewer for Analysis of Neural Network Data

A framework is presented for interactive, visually supported analysis of the cellular composition of a reconstructed neural tissue volume: determining the nodes of the brain network, statistically quantifying connectivity features, and comparing these to the predictions of mathematical models.

TransCluster: A Cell-Type Identification Method for single-cell RNA-Seq data using deep learning based on transformer

This work proposes a hybrid network structure called TransCluster, which uses linear discriminant analysis and a modified Transformer to enhance feature learning and is the first attempt to use Transformer for annotating cell types of scRNA-seq, which greatly improves the accuracy of cell-type identification.

Designing Data: Proactive Data Collection and Iteration for Machine Learning

Lack of diversity in data collection has caused significant failures in machine learning applications, so new methods to track and manage data collection, iteration, and post-collection interventions are needed.

Relative Confusion Matrix: Efficient Comparison of Decision Models

The Relative Confusion Matrix (RCM) is presented, a new matrix visualization that leverages confusion matrices and a color encoding to expose the class-wise differences in performance between two models. Results show that the RCM encoding leads to a more efficient comparison of two models than existing approaches.
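To make the idea of class-wise differences between two models concrete, here is a rough Python sketch that subtracts one confusion matrix from another cell by cell. The function `matrix_difference` and the counts are hypothetical, and the published Relative Confusion Matrix may normalize or encode the differences in another way; this only illustrates the general comparison.

```python
def matrix_difference(cm_a, cm_b):
    """Cell-wise difference between two confusion matrices of equal shape.

    A positive cell means model A placed more instances in that
    (actual, predicted) cell than model B did.
    """
    return [
        [a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(cm_a, cm_b)
    ]


# Hypothetical 2-class confusion matrices (rows: actual, columns: predicted),
# both computed on the same 20 test instances.
cm_model_a = [[8, 2], [3, 7]]
cm_model_b = [[6, 4], [1, 9]]

diff = matrix_difference(cm_model_a, cm_model_b)
print(diff)
```

In a visualization, the sign and magnitude of each cell could be mapped to a diverging color scale, so that cells where one model confuses a class pair more often than the other stand out.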

References

Designing Alternative Representations of Confusion Matrices to Support Non-Expert Public Understanding of Algorithm Performance

This work redesigns confusion matrices for binary classification to support non-experts in understanding the performance of machine learning models, and suggests that only by contextualizing terminologies can the authors significantly improve users' understanding.

ConfusionFlow: A Model-Agnostic Visualization for Temporal Analysis of Classifier Confusion

ConfusionFlow is an interactive, comparative visualization tool that combines the benefits of class confusion matrices with the visualization of performance characteristics over time. It is model-agnostic and can be used to compare performance across model types, model architectures, and/or training and test datasets.

Understanding and Visualizing Data Iteration in Machine Learning

This work designs a collection of interactive visualizations and integrates them into a prototype, Chameleon, that lets users compare data features, training/testing splits, and performance across data versions and identifies opportunities for future data iterations.

EnsembleMatrix: interactive visualization to support machine learning with multiple classifiers

EnsembleMatrix is an interactive visualization system that presents a graphical view of confusion matrices to help users understand relative merits of various classifiers and allows users to directly interact with the visualizations in order to explore and build combination models.

Squares: Supporting Interactive Performance Analysis for Multiclass Classifiers

Squares is a performance visualization for multiclass classification problems that supports estimating common performance metrics while displaying the instance-level distribution information needed to help practitioners prioritize efforts and access data.

Interactive optimization for steering machine classification

ManiMatrix is presented, a system that provides controls and visualizations that enable system builders to refine the behavior of classification systems in an intuitive manner. Results show that users are able to quickly and effectively modify the decision boundaries of classifiers to tailor the behavior of classifiers to the problems at hand.

A Novel Visualization Approach for Data-Mining-Related Classification

  • C. Seifert, E. Lex
  • Computer Science
  • 2009 13th International Conference Information Visualisation
  • 2009
An intuitive visualization system is proposed for observing and understanding classification processes and results; it handles multiple classes and nominal and numeric attributes, and supports all classifiers whose predictions can be interpreted as probabilities.

Grounding Interactive Machine Learning Tool Design in How Non-Experts Actually Build Models

This work investigated how non-experts build ML solutions for themselves in real life and suggested that, while challenging, making ML easy and robust should both be important goals of designing novice-facing ML tools.

Polaris: A System for Query, Analysis, and Visualization of Multidimensional Relational Databases

Polaris is presented, an interface for exploring large multidimensional databases that extends the well-known pivot table interface. It includes an interface for constructing visual specifications of table-based graphical displays and the ability to generate a precise set of relational queries from the visual specifications.

ActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models

ActiVis, an interactive visualization system for interpreting large-scale deep learning models and results, is developed, deployed, and iteratively improved; it lets users explore complex deep neural network models at both the instance and subset level.